Abstract

We prove tight bounds on the time needed to solve k-set agreement. In this prob- 
lem, each processor starts with an arbitrary input value taken from a fixed set, and halts 
after choosing an output value. In every execution, at most k distinct output values may 
be chosen, and every processor's output value must be some processor's input value. 
We analyze this problem in a synchronous, message-passing model where processors 
fail by crashing. We prove a lower bound of $\lfloor f/k \rfloor + 1$ rounds of communication for
solutions to k-set agreement that tolerate f failures, and we exhibit a protocol proving
the matching upper bound. This result shows that there is an inherent tradeoff between 
the running time, the degree of coordination required, and the number of faults toler- 
ated, even in idealized models like the synchronous model. The proof of this result 
is interesting because it is the first to apply topological techniques to the synchronous 
model.

1 Introduction

Most interesting problems in concurrent and distributed computing require processors 
to coordinate their actions in some way. It can also be important for protocols solv- 
ing these problems to tolerate processor failures, and to execute quickly. Ideally, one 
would like to optimize all three properties — degree of coordination, fault-tolerance, 
and efficiency — but in practice, of course, it is usually necessary to make tradeoffs 
among them. In this paper, we give a precise characterization of the tradeoffs required 
by studying a family of basic coordination problems called k-set agreement.

In k-set agreement [Cha91], each processor starts with an arbitrary input value and 
halts after choosing an output value. These output values must satisfy two conditions: 
each output value must be some processor's input value, and the set of output values
chosen must contain at most k distinct values. The first condition rules out trivial
solutions in which a single value is hard-wired into the protocol and chosen by all 
processors in all executions, and the second condition requires that the processors co- 
ordinate their choices to some degree. This problem is interesting because it defines 
a family of coordination problems of increasing difficulty. At one extreme, if n is the 
number of processors in the system, then n-set agreement is trivial: each processor 
simply chooses its own input value. At the other extreme, 1-set agreement requires that 
all processors choose the same output value, a problem equivalent to the consensus 
problem [LSP82, PSL80, FL82, FLP85, Dol82, Fis83]. Consensus is well-known to 
be the "hardest" problem, in the sense that all other decision problems can be reduced 
to it. Consensus arises in applications as diverse as on-board aircraft control [W+78],
database transaction commit [BHG87], and concurrent object design [Her91]. Between 
these extremes, as we vary the value of k from n to 1, we gradually increase the degree 
of processor coordination required. 

We consider this family of problems in a synchronous, message-passing model with 
crash failures. In this model, n processors communicate by sending messages over a 
completely connected network. Computation in this model proceeds in a sequence 
of rounds. In each round, processors send messages to other processors, then receive 
messages sent to them in the same round, and then perform some local computation 
and change state. This means that all processors take steps at the same rate, and that 
all messages take the same amount of time to be delivered. Communication is reliable, 
but up to f processors can fail by stopping in the middle of the protocol.

The primary contribution of this paper is a lower bound on the amount of time
required to solve k-set agreement, together with a protocol for k-set agreement that
proves a matching upper bound. Specifically, we prove that any protocol solving
k-set agreement in this model and tolerating f failures requires $\lfloor f/k \rfloor + 1$ rounds of
communication in the worst case, assuming $n \ge f + k + 1$ (meaning that there are at
least k + 1 nonfaulty processors), and we prove a matching upper bound by exhibiting a
protocol that solves k-set agreement in $\lfloor f/k \rfloor + 1$ rounds. Since consensus is just 1-set
agreement, our lower bound implies the well-known lower bound of f + 1 rounds for
consensus when $n \ge f + 2$ [FL82]. More important, the running time $r = \lfloor f/k \rfloor + 1$
demonstrates that there is a smooth but inescapable tradeoff among the number f of
faults tolerated, the degree k of coordination achieved, and the time r the protocol
must run. For a fixed value of f, Figure 1 shows that 2-set agreement can be achieved
in half the time needed to achieve consensus. In addition, the lower bound proof itself
is interesting because of the geometric proof technique we use, combining ideas due
to Chaudhuri [Cha91, Cha93], Fischer and Lynch [FL82], Herlihy and Shavit [HS93],
and Dwork, Moses, and Tuttle [DM90, MT88].

Figure 1: Tradeoff between rounds and degree of coordination (horizontal axis: k = degree of coordination; vertical axis: r = rounds).
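To make the tradeoff concrete, the following minimal sketch tabulates the bound $\lfloor f/k \rfloor + 1$ for a fixed number of failures. The function name rounds_needed and the choice f = 10 are illustrative assumptions, and the bound itself applies only when $n \ge f + k + 1$.

```python
def rounds_needed(f: int, k: int) -> int:
    """Worst-case number of rounds, floor(f/k) + 1, for k-set agreement with f crashes."""
    if f < 0 or k < 1:
        raise ValueError("need f >= 0 and k >= 1")
    return f // k + 1


# Tradeoff for a fixed number of failures (f = 10 is an arbitrary choice).
f = 10
table = {k: rounds_needed(f, k) for k in range(1, 6)}
for k, r in table.items():
    print(f"k = {k}: r = {r}")
```

With f = 10, consensus (k = 1) needs 11 rounds while 2-set agreement needs 6, roughly half.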

In the past few years, researchers have developed powerful new tools based on 
classical algebraic topology for analyzing tasks in asynchronous models (e.g., [AR96, 
BG93, GK96, HR94, HR95, HS93, HS94, SZ93]). 

The principal innovation of these papers is to model computations as simplicial 
complexes (rather than graphs) and to derive connections between computations and 
the topological properties of their complexes. This paper extends this topological ap- 
proach in several new ways: it is the first to derive results in the synchronous model, 
it derives lower bounds rather than computability results, and it uses explicit construc- 
tions instead of existential arguments. 

Although the synchronous model makes some strong (and possibly unrealistic) as- 
sumptions, it is well-suited for proving lower bounds. The synchronous model is a 
special case of almost every realistic model of a concurrent system we can imagine, 
and therefore any lower bound for k-set agreement in this simple model translates into
a lower bound in any more complex model. For example, our lower bound holds for 
models that permit messages to be lost, failed processors to restart, or processor speeds 
to vary. Moreover, our techniques may be helpful in understanding how to prove (possibly)
stricter lower bounds in more complex models. Naturally, our protocol for k-set
agreement in the synchronous model does not work in more general models, but it is 
still useful because it shows that our lower bound is the best possible in the synchronous 
model. 

This paper is organized as follows. In Section 2, we give an informal overview 
of our lower bound proof. In Section 3 we define our model of computation, and in 
Section 4 we define k-set agreement. In Sections 5 through 9 we prove our lower
bound, and in Section 10 we give a protocol solving k-set agreement, proving the
matching upper bound. 



2 Overview 

We start with an informal overview of the ideas used in the lower bound proof. For 
the remainder of this paper, suppose P is a protocol that solves k-set agreement and
tolerates the failure of f out of n processors, and suppose P halts in $r < \lfloor f/k \rfloor + 1$
rounds. This means that all nonfaulty processors have chosen an output value at time r
in every execution of P. In addition, suppose $n \ge f + k + 1$, which means that at
least k + 1 processors never fail. Our goal is to consider the global states that occur at
time r in executions of P, and to show that in one of these states there are k + 1
processors that have chosen k + 1 distinct values, violating k-set agreement. Our strategy
is to consider the local states of processors that occur at time r in executions of P, and 
to investigate the combinations of these local states that occur in global states. This 
investigation depends on the construction of a geometric object. In this section, we use 
a simplified version of this object to illustrate the general ideas in our proof. 

Since consensus is a special case of k-set agreement, it is helpful to review the
standard proof of the f + 1 round lower bound for consensus [FL82, DS83, Mer85, DM90]
to see why new ideas are needed for k-set agreement. Suppose that the protocol P is
a consensus protocol, which means that in all executions of P all nonfaulty processors
have chosen the same output value at time r. Two global states $g_1$ and $g_2$ at time r are
said to be similar if some nonfaulty processor p has the same local state in both global
states. The crucial property of similarity is that the decision value of any processor
in one global state completely determines the decision value for any processor in all
similar global states. For example, if all processors decide v in $g_1$, then certainly p
decides v in $g_1$. Since p has the same local state in $g_1$ and $g_2$, and since p's decision value
is a function of its local state, processor p also decides v in $g_2$. Since all processors
agree with p in $g_2$, all processors decide v in $g_2$, and it follows that the decision value
in $g_1$ determines the decision value in $g_2$. A similarity chain is a sequence of global
states, $g_1, \ldots, g_\ell$, such that $g_i$ is similar to $g_{i+1}$. A simple inductive argument shows
that the decision value in $g_1$ determines the decision value in $g_\ell$. The lower bound proof
consists of showing that all time r global states of P lie on a single similarity chain. It 
follows that all processors choose the same value in all executions of P, independent 
of the input values, violating the definition of consensus. 
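The chain argument can be sketched in code. This is an illustration under simplifying assumptions: a global state is a tuple of hypothetical local-state labels, one per processor, every listed processor is treated as nonfaulty, and "lying on a single similarity chain" is checked as connectivity of the similarity relation (a chain is allowed to revisit states).

```python
from collections import deque


def similar(g1, g2):
    """Two global states are similar if some processor has the same local state in both."""
    return any(s1 == s2 for s1, s2 in zip(g1, g2))


def all_on_one_chain(states):
    """Return True if every state is reachable from the first one by a similarity chain."""
    if not states:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(len(states)):
            if j not in seen and similar(states[i], states[j]):
                seen.add(j)
                queue.append(j)
    return len(seen) == len(states)


# Hypothetical two-processor example: the middle state links the two extremes.
chain = [("a", "a"), ("a", "b"), ("b", "b")]
broken = [("a", "a"), ("b", "b")]
print(all_on_one_chain(chain), all_on_one_chain(broken))
```

In the consensus proof, every state on such a chain is forced to carry the same decision value, which is why a single chain covering all time r states is fatal for a fast protocol.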

The problem with k-set agreement is that the decision values in one global state do
not determine the decision values in similar global states. If p has the same local state
in $g_1$ and $g_2$, then p must choose the same value in both states, but the values chosen
by the other processors are not determined. Even if n - 1 processors have the same
local state in $g_1$ and $g_2$, the decision value of the last processor is still not determined.
The fundamental insight in this paper is that k-set agreement requires considering all
"degrees" of similarity at once, focusing on the number and identity of local states 
common to two global states. While this seems difficult — if not impossible — to do us- 
ing conventional graph theoretic techniques like similarity chains, there is a geometric 
generalization of similarity chains that provides a compact way of capturing all degrees 
of similarity simultaneously, and it is the basis of our proof. 

A simplex is just the natural generalization of a triangle to n dimensions: for
example, a 0-dimensional simplex is a vertex, a 1-dimensional simplex is an edge linking
two vertices, a 2-dimensional simplex is a solid triangle, and a 3-dimensional simplex
is a solid tetrahedron. We can represent a global state for an n-processor protocol as
an (n - 1)-dimensional simplex [Cha93, HS93], where each vertex is labeled with a
processor id and local state. If $g_1$ and $g_2$ are global states in which $p_1$ has the same
local state, then we "glue together" the vertices of $g_1$ and $g_2$ labeled with $p_1$. Figure 2
shows how these global states glue together in a simple protocol in which each of three
processors repeatedly sends its state to the others. Each processor begins with a binary
input. The first picture shows the possible global states after zero rounds: since no
communication has occurred, each processor's state consists only of its input. It is easy
to check that the simplexes corresponding to these global states form an octahedron.
The next picture shows the complex after one round. Each triangle corresponds to a
failure-free execution, each free-standing edge to a single-failure execution, and so on.
The third picture shows the possible global states after two rounds.

Figure 2: Global states for zero, one, and two-round protocols.

Figure 3: Global states for an r-round protocol (showing the embedded Bermuda Triangle).

The set of global states after an r-round protocol is quite complicated (Figure 3),
but it contains a well-behaved subset of global states which we call the Bermuda
Triangle B, since all fast protocols vanish somewhere in its interior. The Bermuda Triangle
(Figure 4) is constructed by starting with a large k-dimensional simplex, and triangulating
it into a collection of smaller k-dimensional simplexes. We then label each vertex
with an ordered pair (p, s) consisting of a processor identifier p and a local state s in 
such a way that for each simplex T in the triangulation there is a global state g con- 
sistent with the labeling of the simplex: for each ordered pair (p, s) labeling a corner 
of T, processor p has local state s in global state g. 

Figure 4: Bermuda Triangle with simplex representing typical global state. (Vertices are labeled with (processor, local state) pairs, such as (P, a) and (Q, b); each simplex is consistent with a global state.)

To illustrate the process of labeling vertices, Figure 5 shows a simplified
representation of a two-dimensional Bermuda Triangle B. It is the Bermuda Triangle for
a protocol P for 5 processors solving 2-set agreement in 1 round. We have labeled
grid points with local states, but we have omitted processor ids and many intermediate
nodes for clarity. The local states in the figure are represented by expressions such
as bblaa. Given 3 distinct input values a, b, c, we write bblaa to denote the local state
of a processor p at the end of a round in which the first two processors have input value b
and send messages to p, the middle processor fails to send a message to p, and the last
two processors have input value a and send messages to p. In Figure 5, following any
horizontal line from left to right across B, the input values are changed from a to b.
The input value of each processor is changed, one after another, by first silencing the
processor, and then reviving the processor with the input value b. Similarly, moving
along any vertical line from bottom to top, processors' input values change from b to c.

The complete labeling of the Bermuda Triangle B shown in Figure 5 — which 
would include processor ids — has the following property. Let (p, s) be the label of 
a grid point x. If x is a corner of B, then s specifies that each processor starts with the
same input value, so p must choose this value if it finishes protocol P in local state s. 
If x is on an edge of B, then s specifies that each processor starts with one of the two 
input values labeling the ends of the edge, so p must choose one of these values if it 
halts in state s. Similarly, if x is in the interior of B, then s specifies that each processor 
starts with one of the three values labeling the corners of B, so p must choose one of 
these three values if it halts in state s. 

Now let us "color" each grid point with output values (Figure 6). Given a grid
point x labeled with (p, s), let us color x with the value v that p chooses in local state s
at the end of P. This coloring of B has the property that the color of each of the
corners is determined uniquely, the color of each point on an edge between two corners is
forced to be the color of one of the corners, and the color of each interior point can be
the color of any corner. Colorings with this property are called Sperner colorings, and
have been studied extensively in the field of algebraic topology. At this point, we
exploit a remarkable combinatorial result first proved in 1928: Sperner's Lemma [Spa66,
p. 151] states that any Sperner coloring of any triangulated k-dimensional simplex must
include at least one simplex whose corners are colored with all k + 1 colors. In our
case, however, this simplex corresponds to a global state in which k + 1 processors
choose k + 1 distinct values, which contradicts the definition of k-set agreement. Thus,
in the case illustrated above, there is no protocol for 2-set agreement halting in 1 round.

Figure 6: Sperner's Lemma.
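Sperner's Lemma can be checked experimentally in the two-dimensional case. The sketch below is an illustration, not part of the proof. It assumes the triangle with corners (0,0), (N,0), (N,N) cut into unit triangles by walking one step along each axis from a grid point, assigns a random Sperner coloring using the hypothetical colors 0, 1, 2, and counts the triangles whose corners carry all three colors.

```python
import itertools
import random


def grid_points(N):
    """Grid points (x1, x2) with N >= x1 >= x2 >= 0."""
    return [(x1, x2) for x1 in range(N + 1) for x2 in range(x1 + 1)]


def triangles(N):
    """Unit triangles: start at a grid point and step once along each axis, in either order."""
    pts = set(grid_points(N))
    result = []
    for y0 in grid_points(N):
        for steps in itertools.permutations([(1, 0), (0, 1)]):
            walk = [y0]
            for dx, dy in steps:
                walk.append((walk[-1][0] + dx, walk[-1][1] + dy))
            if all(v in pts for v in walk):
                result.append(tuple(walk))
    return result


def allowed_colors(point, N):
    """Sperner condition: corners are fixed, edge points use the two end colors."""
    x1, x2 = point
    allowed = {0, 1, 2}      # color 0 at (0,0), color 1 at (N,0), color 2 at (N,N)
    if x2 == 0:
        allowed &= {0, 1}    # edge from (0,0) to (N,0)
    if x1 == N:
        allowed &= {1, 2}    # edge from (N,0) to (N,N)
    if x1 == x2:
        allowed &= {0, 2}    # edge from (0,0) to (N,N)
    return allowed


def count_full_triangles(N, rng):
    """Color every grid point at random within the Sperner constraints and count 3-colored triangles."""
    color = {p: rng.choice(sorted(allowed_colors(p, N))) for p in grid_points(N)}
    return sum(1 for t in triangles(N) if {color[v] for v in t} == {0, 1, 2})


counts = [count_full_triangles(6, random.Random(seed)) for seed in range(20)]
print(counts)
```

Every count is at least 1, as the lemma requires; the standard statement of the lemma is slightly stronger and says the count is odd.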

We note that the basic structure of the Bermuda Triangle and the idea of coloring the 
vertices with decision values and applying Sperner's Lemma have appeared in previous 
work by Chaudhuri [Cha91, Cha93]. In that work, she also proved a lower bound 
of $\lfloor f/k \rfloor + 1$ rounds for k-set agreement, but for a very restricted class of protocols.
In particular, a protocol's decision function can depend only on vectors giving partial 
information about which processors started with which input values, but cannot depend 
on any other information in a processor's local state, such as processor identities or 
message histories. The technical challenge in this paper is to construct a labeling of 
vertices with processor ids and local states that will allow us to prove a lower bound 
for k-set agreement for arbitrary protocols.

Our approach consists of four parts. First, we label points on the edges of B with 
global states. For example, consider the edge between the corner where all processors 
start with input value a and the corner where all processors start with b. We construct
a long sequence of global states that begins with a global state in which all processors
start with a, ends with a global state in which all processors start with b, and in between 
systematically changes input values from a to b. These changes are made so gradually, 
however, that for any two adjacent global states in the sequence, at most one processor 
can distinguish them. Second, we label each remaining point using a combination of 
the global states on the edges. Third, we assign nonfaulty processors to points in such 
a way that the processor labeling a point has the same local state in the global states 
labeling all adjacent points. Finally, we project each global state onto the associated 
nonfaulty processor's local state, and label the point with the resulting processor-state 
pair. 

3 The Model 

We use a synchronous, message-passing model with crash failures. The system
consists of n processors, $p_1, \ldots, p_n$. Processors share a global clock that starts at 0 and
advances in increments of 1. Computation proceeds in a sequence of rounds, with
round r lasting from time r - 1 to time r. Computation in a round consists of three
phases: first each processor p sends messages to some of the processors in the sys- 
tem, possibly including itself, then it receives the messages sent to it during the round, 
and finally it performs some local computation and changes state. We assume that the 
communication network is totally connected: every processor is able to send distinct 
messages to every other processor in every round. We also assume that communication 
is reliable (although processors can fail): if p sends a message to q in round r, then the 
message is delivered to q in round r. 

Processors follow a deterministic protocol that determines what messages a pro- 
cessor should send and what output a processor should generate. A protocol has two 
components: a message component that maps a processor's local state to the list of 
messages it should send in the next round, and an output component that maps a pro- 
cessor's local state to the output value (if any) that it should choose. Processors can be 
faulty, however, and any processor p can simply stop in any round r. In this case,
processor p follows its protocol and sends all messages the protocol requires in rounds 1
through r - 1, sends some subset of the messages it is required to send in round r, and
sends no messages in rounds after r. We say that p is silent from round r if p sends 
no messages in round r or later. We say that p is active through round r if p sends all 
messages in round r and earlier. 
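The crash semantics above can be exercised with a small simulation. The sketch below is an assumption-laden illustration: it uses the standard "flood the minimum" rule from the literature (each processor forwards the smallest value it has seen), which is known to solve k-set agreement in $\lfloor f/k \rfloor + 1$ rounds; this section does not say whether it is the paper's own protocol. The function name flood_min and the crash schedule format are hypothetical.

```python
def flood_min(inputs, rounds, crashes):
    """Simulate synchronous rounds with crash failures.

    inputs:  list of comparable input values, one per processor.
    crashes: dict p -> (crash_round, receivers reached in that round); p sends to
             everyone before crash_round and to no one afterwards.
    Returns the values held by the nonfaulty processors after the last round.
    """
    n = len(inputs)
    value = list(inputs)
    for rnd in range(1, rounds + 1):
        inbox = [[value[q]] for q in range(n)]      # each processor keeps its own value
        for p in range(n):
            receivers = range(n)
            if p in crashes:
                crash_round, reached = crashes[p]
                if rnd > crash_round:
                    continue                        # p is silent after it stops
                if rnd == crash_round:
                    receivers = reached             # only a subset is sent
            for q in receivers:
                inbox[q].append(value[p])
        value = [min(box) for box in inbox]
    return {p: value[p] for p in range(n) if p not in crashes}


# Hypothetical run: n = 4, f = 2 crashes, k = 2, so floor(f/k) + 1 = 2 rounds.
crashes = {0: (1, {1}), 1: (2, {2})}
two_rounds = flood_min([0, 1, 2, 3], 2, crashes)
three_rounds = flood_min([0, 1, 2, 3], 3, crashes)
print(two_rounds, three_rounds)
```

In this schedule the survivors still hold two distinct values after two rounds, which is acceptable for 2-set agreement but not for consensus; a third round brings them to a single value.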

A full-information protocol is one in which every processor broadcasts its
entire local state to every processor, including itself, in every round [PSL80, FL82,
Had83]. One nice property of full-information protocols is that every execution of
a full-information protocol P has a compact representation called a communication
graph [MT88]. The communication graph Q for an r-round execution of P is a
two-dimensional two-colored graph. The vertices form an n × (r + 1) grid, with processor
names 1 through n labeling the vertical axis and times 0 through r labeling the
horizontal axis. The node representing processor p at time i is labeled with the pair (p, i).
Given any pair of processors p and q and any round i, there is an edge between (p, i - 1)
and (q, i) whose color determines whether p successfully sends a message to q in
round i: the edge is green if p succeeds, and red otherwise. In addition, each node (p, 0)
is labeled with p's input value. Figure 7 illustrates a three-round communication graph.
In this figure, green edges are denoted by solid lines and red edges by dashed lines.
We refer to the edge between (p, i - 1) and (q, i) as the round i edge from p to q, and
we refer to the node (p, i - 1) as the round i node for p since it represents the point
at which p sends its round i messages. We define what it means for a processor to be
silent or active in terms of communication graphs in the obvious way.

Figure 7: A three-round communication graph.

In the crash failure model, a processor is silent in all rounds following the round in 
which it stops. This means that all communication graphs representing executions in 
this model have the property that if a round i edge from p is red, then all round $j \ge i + 1$
edges from p are red, which means that p is silent from round i + 1. We assume that all 
communication graphs in this paper have this property, and we note that every r-round 
graph with this property corresponds to an r-round execution of P. 
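As an illustration, the crash property can be checked mechanically. The representation below is an assumption made for this sketch: green[i][p][q] is True when the round i + 1 edge from processor p to processor q is green, with processors and rounds indexed from 0 in the code.

```python
def is_crash_graph(green):
    """Check that once some edge from p is red, all later-round edges from p are red."""
    rounds = len(green)
    n = len(green[0])
    for p in range(n):
        failed = False
        for i in range(rounds):
            if failed and any(green[i][p]):
                return False          # p sent a message after it had stopped
            if not all(green[i][p]):
                failed = True         # p dropped a message in this round
    return True


def all_green(n, rounds):
    """Failure-free graph: every edge in every round is green."""
    return [[[True] * n for _ in range(n)] for _ in range(rounds)]


ok = all_green(3, 2)
ok[0][0][2] = False                   # round 1: p0 fails to reach p2
ok[1][0] = [False, False, False]      # round 2: p0 is silent

bad = all_green(3, 2)
bad[0][0][2] = False                  # round 1: p0 fails to reach p2 ...
                                      # ... but still sends in round 2
print(is_crash_graph(ok), is_crash_graph(bad))
```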

Since a communication graph Q describes an execution of P, it also determines 
the global state at the end of P, so we sometimes refer to Q as a global communica- 
tion graph. In addition, for each processor p and time t, there is a subgraph of Q that 
corresponds to the local state of p at the end of round t, and we refer to this subgraph as
a local communication graph. The local communication graph for p at time t is the
subgraph Q(p, t) of Q containing all the information visible to p at the end of round t.
Namely, Q(p, t) is the subgraph induced by the node (p, t) and all earlier nodes reach- 
able from (p, t) by a sequence (directed backwards in time) of green edges followed by 
at most one red edge. In the remainder of this paper, we use graphs to represent states. 
Wherever we used "state" in the informal overview of Section 2, we now substitute the 
word "graph." Furthermore, we defined a full-information protocol to be a protocol in 
which processors broadcast their local states in every round, but we now assume that 
processors broadcast their local communication graphs instead. In addition, we assume 
that all executions of a full-information protocol run for exactly r rounds and produce 
output at exactly time r. All local and global communication graphs are graphs at 
time r, unless otherwise specified. 
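The node set of a local communication graph can be computed directly from this definition. The sketch below is illustrative and assumes the same hypothetical representation as before: green[i][s][q] is True when the round i + 1 edge from s to q is green, and a node is a pair (processor, time) with processors indexed from 0.

```python
def local_nodes(green, p, t):
    """Nodes of Q(p, t): (p, t) plus earlier nodes reachable backwards in time
    by green edges followed by at most one red edge."""
    n = len(green[0])
    reached = {(p, t)}                     # reachable through green edges only
    frontier = [(p, t)]
    while frontier:
        q, i = frontier.pop()
        if i == 0:
            continue
        for s in range(n):
            if green[i - 1][s][q] and (s, i - 1) not in reached:
                reached.add((s, i - 1))
                frontier.append((s, i - 1))
    # One final red step from any green-reachable node.
    one_red = {
        (s, i - 1)
        for (q, i) in reached if i > 0
        for s in range(n) if not green[i - 1][s][q]
    }
    return reached | one_red


# Hypothetical 3-processor, 2-round graph: p0 misses p2 in round 1, then is silent.
green = [[[True] * 3 for _ in range(3)] for _ in range(2)]
green[0][0][2] = False
green[1][0] = [False, False, False]
view = local_nodes(green, 2, 2)
print(sorted(view))
```

Here processor 2 sees node (0, 1) only through a red edge: it learns that processor 0 sent it nothing in round 2, but nothing about what processor 0 received in round 1.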

The crucial property of a full-information protocol is that every protocol can be 
simulated by a full-information protocol, and hence that we can restrict attention to 
full-information protocols when proving the lower bound in this paper: 

Lemma 1: If there is an n-processor protocol solving k-set agreement with f failures
in r rounds, then there is an n-processor full-information protocol solving k-set
agreement with f failures in r rounds.

4 The k-set Agreement Problem

The k-set agreement problem [Cha91] is defined as follows. We assume that each 
processor $p_i$ has two private registers in its local state, a read-only input register and a
write-only output register. Initially, $p_i$'s input register contains an arbitrary input value
from a set V containing at least k + 1 values $v_0, \ldots, v_k$, and its output register is empty.
A protocol solves the problem if it causes each processor to halt after writing an output 
value to its output register in such a way that 

1. every processor's output value is some processor's input value, and 

2. the set of output values chosen has size at most k. 
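The two conditions translate directly into a checker for a single execution. This is a minimal sketch; the function name and the dictionary representation (processor id mapped to value) are assumptions made for illustration.

```python
def satisfies_k_set_agreement(inputs, outputs, k):
    """Check the two conditions for one execution.

    inputs:  dict mapping every processor to its input value.
    outputs: dict mapping each processor that chose a value to that value.
    """
    chosen = set(outputs.values())
    validity = chosen <= set(inputs.values())   # condition 1
    agreement = len(chosen) <= k                # condition 2
    return validity and agreement


inputs = {1: "a", 2: "b", 3: "c"}
print(satisfies_k_set_agreement(inputs, {1: "a", 2: "a", 3: "b"}, k=2))
print(satisfies_k_set_agreement(inputs, {1: "a", 2: "a", 3: "b"}, k=1))
```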

5 Bermuda Triangle 

In this section, we define the basic geometric constructs used in our proof that every 
protocol P solving k-set agreement and tolerating f failures requires at least $\lfloor f/k \rfloor + 1$
rounds of communication, assuming $n \ge f + k + 1$.

We start with some preliminary definitions. A simplex S is the convex hull of k + 1
affinely independent points $x_0, \ldots, x_k$ in Euclidean space (see footnote 1). It is a
k-dimensional volume, the k-dimensional analogue of a solid triangle or tetrahedron. The points
$x_0, \ldots, x_k$ are called the vertices of S, and k is the dimension of S. We sometimes
call S a k-simplex when we wish to emphasize its dimension. A simplex F is a face
of S if the vertices of F form a subset of the vertices of S (which means that the
dimension of F is at most the dimension of S). A set of k-simplexes $S_1, \ldots, S_\ell$ is a
triangulation of S if $S = S_1 \cup \cdots \cup S_\ell$ and the intersection of $S_i$ and $S_j$ is a face of
each for all pairs i and j (see footnote 2). The vertices of a triangulation are the vertices
of the $S_i$. Any triangulation of S induces triangulations of its faces in the obvious way.

The construction of the Bermuda Triangle is illustrated in Figure 8. Let $\mathbf{B}$ be the
k-simplex in k-dimensional Euclidean space with vertices

$(0, \ldots, 0), (N, 0, \ldots, 0), (N, N, 0, \ldots, 0), \ldots, (N, \ldots, N)$,

where N is a huge integer defined later in Section 6.3. The Bermuda Triangle B is a
triangulation of $\mathbf{B}$ defined as follows. The vertices of B are the grid points contained
in $\mathbf{B}$: these are the points of the form $x = (x_1, \ldots, x_k)$, where the $x_i$ are integers
between 0 and N satisfying $x_1 \ge x_2 \ge \cdots \ge x_k$.

Footnote 1: Points $x_0, \ldots, x_k$ are affinely independent if $x_1 - x_0, \ldots, x_k - x_0$ are linearly independent.
Footnote 2: Notice that the intersection of two arbitrary k-dimensional simplexes $S_i$ and $S_j$ will be a volume of
some dimension, but it need not be a face of either simplex.

Informally, the simplexes of the triangulation are defined as follows: pick any grid
point and walk one step in the positive direction along each dimension (Figure 9).
The k + 1 points visited by this walk define the vertices of a simplex, and the
triangulation B consists of all simplexes determined by such walks. For example, the
2-dimensional Bermuda Triangle is illustrated in Figure 5. This triangulation, known as
Kuhn's triangulation, is defined formally as follows [Cha93]. Let $e_1, \ldots, e_k$ be the
unit vectors; that is, $e_i$ is the vector (0, ..., 1, ..., 0) with a single 1 in the ith
coordinate. A simplex is determined by a point $y_0$ and an arbitrary permutation $f_1, \ldots, f_k$ of
the unit vectors $e_1, \ldots, e_k$: the vertices of the simplex are the points $y_i = y_{i-1} + f_i$
for all i > 0. When we list the vertices of a simplex, we always write them in the
order $y_0, \ldots, y_k$ in which they are visited by the walk.

Figure 9: Simplex generation in Kuhn's triangulation.
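The walk-based definition is easy to enumerate for small cases. The sketch below is an illustration under the assumption that a walk contributes a simplex only when all of its k + 1 points are grid points of the region $N \ge x_1 \ge \cdots \ge x_k \ge 0$; the function names are hypothetical.

```python
from itertools import permutations, product


def in_region(x, N):
    """Grid point test: N >= x_1 >= x_2 >= ... >= x_k >= 0."""
    return x[0] <= N and x[-1] >= 0 and all(a >= b for a, b in zip(x, x[1:]))


def kuhn_simplexes(k, N):
    """All simplexes (y_0, ..., y_k) with y_i = y_{i-1} + f_i for a permutation f of unit vectors."""
    units = [tuple(1 if j == i else 0 for j in range(k)) for i in range(k)]
    simplexes = []
    for y0 in product(range(N + 1), repeat=k):
        if not in_region(y0, N):
            continue
        for perm in permutations(units):
            walk = [y0]
            for step in perm:
                walk.append(tuple(a + b for a, b in zip(walk[-1], step)))
            if all(in_region(y, N) for y in walk):
                simplexes.append(tuple(walk))
    return simplexes


for k, N in [(1, 4), (2, 3), (3, 2)]:
    print(k, N, len(kuhn_simplexes(k, N)))
```

For these small cases the number of simplexes comes out to N^k, which matches a volume count: the region has volume N^k / k! and each primitive simplex has volume 1 / k!.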

For brevity, we refer to the vertices of the underlying simplex as the corners of B. The
edges of the underlying simplex are subdivided to form the edges of B. More formally, the
triangulation B induces triangulations of the one-dimensional faces (line segments
connecting the vertices) of the underlying simplex, and these induced triangulations are
called the edges of B. The simplexes of B are called primitive simplexes.

Each vertex of B is labeled with an ordered pair $(p, \mathcal{L})$ consisting of a processor id p
and a local communication graph $\mathcal{L}$. As illustrated in the overview in Section 2, the
crucial property of this labeling is that if S is a primitive simplex with vertices $y_0, \ldots, y_k$,
and if each vertex $y_i$ is labeled with a pair $(q_i, \mathcal{L}_i)$, then there is a global
communication graph Q such that each $q_i$ is nonfaulty in Q and has local communication
graph $\mathcal{L}_i$ in Q. Constructing this labeling is the subject of the next three sections. We first
assign global communication graphs Q to vertices in Section 6, then we assign processors p to
vertices in Section 7, and then we assign ordered pairs $(p, \mathcal{L})$ to vertices in Section 8,
where $\mathcal{L}$ is the local communication graph of p in Q.

6 Graph Assignment 

In this section, we label each vertex of B with a global communication graph. Actually, 
for expository reasons, we augment the definition of a communication graph and label 
vertices of B with these augmented communication graphs instead. Constructing this 
labeling involves several steps. We define operations on augmented communication 
graphs that make minor changes in the graphs, and we use these operations to construct 
long sequences of graphs. Then we label vertices along edges of B with graphs from 
these sequences, and we label interior vertices of B by performing a merge of the 
graphs labeling the edges. 

6.1 Augmented Communication Graphs 

We extend the definition of a communication graph to make the processor assignment
in Section 7 easier to describe. We augment communication graphs with tokens, and
place tokens on the graph so that if processor p fails in round i, then there is a token
on the node (p, j - 1) for processor p in some round $j \le i$ (Figure 10). In this
sense, every processor failure is "covered" by a token, and the number of processors
failing in the graph is bounded from above by the number of tokens. In the next few
sections, when we construct long sequences of these graphs, tokens will be moved
between adjacent processors within a round, and used to guarantee that processor failures
in adjacent graphs change in an orderly fashion. For every value of $\ell$, we define graphs
with exactly $\ell$ tokens placed on nodes in each round, but we will be most interested in
the two cases with $\ell$ equal to 1 and k.

Figure 10: Three-round communication graph with one token per round.

For each value $\ell > 0$, we define an $\ell$-graph Q to be a communication graph with
tokens placed on the nodes of the graph that satisfies the following conditions for each
round i, $1 \le i \le r$:

1. The total number of tokens on round i nodes is exactly $\ell$.

2. If a round i edge from p is red, then there is a token on a round $j \le i$ node for p.

3. If a round i edge from p is red, then p is silent from round i + 1.

We say that p is covered by a round i token if there is a token on the round i node for p,
we say that p is covered in round i if p is covered by a round $j \le i$ token, and we
say that p is covered in a graph if p is covered in any round. Similarly, we say that a
round i edge from p is covered if p is covered in round i. The second condition says
every red edge is covered by a token, and this together with the first condition implies
that at most $\ell r$ processors fail in an $\ell$-graph. We often refer to an $\ell$-graph as a graph
when the value of $\ell$ is clear from context or unimportant. We emphasize that the tokens
are simply an accounting trick, and have no meaning as part of the global or local state
in the underlying communication graph.

We define a failure-free $\ell$-graph to be an $\ell$-graph in which all edges are green, and
all round i tokens are on processor $p_1$ in all rounds i.

6.2 Graph operations 

We now define four operations on augmented graphs that make only minor changes to 
a graph. In particular, the only change an operation makes is to change the color of 
a single edge, to change the value of a single processor's input, or to move a single 
token between adjacent processors within the same round. The operations are defined
as follows (see Figure 11):

1. delete(i, p, q): This operation changes the color of the round i edge from p to q
to red, and has no effect if the edge is already red. This makes the delivery of the
round i message from p to q unsuccessful. It can only be applied to a graph if p
and q are silent from round i + 1, and p is covered in round i.

2. add(i, p, q): This operation changes the color of the round i edge from p to q to
green, and has no effect if the edge is already green. This makes the delivery of
the round i message from p to q successful. It can only be applied to a graph if p
and q are silent from round i + 1, processor p is active through round i − 1, and p
is covered in round i.

3. change(p, v): This operation changes the input value for processor p to v, and
has no effect if the value is already v. It can only be applied to a graph if p is
silent from round 1, and p is covered in round 1.

4. move(i, p, q): This operation moves a round i token from (p, i − 1) to (q, i − 1),
and is defined only for adjacent processors p and q (that is, $\{p, q\} = \{p_j, p_{j+1}\}$
for some j). It can only be applied to a graph if p is covered by a round i token,
and all red edges are covered by other tokens.

It is obvious from the definition of these operations that they preserve the property of
being an $\ell$-graph: if $\mathcal{G}$ is an $\ell$-graph and $\tau$ is a graph operation, then $\tau(\mathcal{G})$ is an
$\ell$-graph. We define delete, add, and change operations on communication graphs in
exactly the same way, except that the condition "p is covered in round i" is omitted.
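
The preconditions attached to each operation are what keep a graph an $\ell$-graph. As a
rough illustration, the Python sketch below implements delete and move on a toy
representation and rejects any call whose precondition fails. The class name
AugmentedGraph, its fields, and the interpretation of "silent from round i" as "every
edge from p in rounds i, ..., r is red" are assumptions made for this example.

```python
from collections import Counter

class AugmentedGraph:
    """Toy augmented communication graph: red edges plus tokens (illustrative only)."""

    def __init__(self, n, r, tokens, red=()):
        self.n, self.r = n, r
        self.tokens = Counter(tokens)  # (round, processor) -> number of tokens
        self.red = set(red)            # red edges as (round, sender, receiver)

    def covered_in_round(self, p, i, tokens=None):
        # p is covered in round i if it holds a round j token for some j <= i.
        t = self.tokens if tokens is None else tokens
        return any(t[(j, p)] > 0 for j in range(1, i + 1))

    def silent_from(self, p, i):
        # Assumed meaning: every edge from p in rounds i, ..., r is red.
        return all((j, p, q) in self.red
                   for j in range(i, self.r + 1) for q in range(self.n))

    def delete(self, i, p, q):
        ok = (self.silent_from(p, i + 1) and self.silent_from(q, i + 1)
              and self.covered_in_round(p, i))
        if not ok:
            raise ValueError("delete precondition violated")
        self.red.add((i, p, q))

    def move(self, i, p, q):
        if abs(p - q) != 1 or self.tokens[(i, p)] == 0:
            raise ValueError("move needs adjacent processors and a round i token on p")
        remaining = self.tokens.copy()
        remaining[(i, p)] -= 1
        # Every red edge must be covered by the tokens other than the one being moved.
        if not all(self.covered_in_round(s, j, remaining) for (j, s, _) in self.red):
            raise ValueError("move would uncover a red edge")
        remaining[(i, q)] += 1
        self.tokens = remaining
```

In a one-round graph on three processors with the token on processor 0, moving the
token to processor 1 permits delete(1, 1, 2); after that, moving the token away from
processor 1 is rejected because the red edge would lose its cover.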

6.3 Graph sequences 

We now define a sequence $\sigma[v]$ of graph operations that can be applied to any
failure-free graph $\mathcal{G}$ to transform it into the failure-free graph $\mathcal{G}[v]$ in which all
processors have input v. We want to emphasize that the sequences $\sigma[v]$ differ only in
the value v. For this reason, we define a parameterized sequence $\sigma[v]$ with the property
that for all values v and all graphs $\mathcal{G}$, the sequence $\sigma[v]$ transforms $\mathcal{G}$ to $\mathcal{G}[v]$.
In general, we define a parameterized sequence $\sigma[X_1, \ldots, X_\ell]$ to be a sequence of
graph operations with free variables $X_1, \ldots, X_\ell$ appearing as parameters to the graph
operations in the sequence.

Given a graph $\mathcal{G}$, let $\mathit{red}(\mathcal{G}, p, m)$ and $\mathit{green}(\mathcal{G}, p, m)$ be graphs identical
to $\mathcal{G}$ except that all edges from p in rounds m, ..., r are red and green, respectively.
We define these graphs only if

1. p is covered in round m in $\mathcal{G}$,

2. all faulty processors are silent from round m (or earlier) in $\mathcal{G}$, and

3. all tokens are on $p_1$ in rounds m + 1, ..., r in $\mathcal{G}$.

In addition, we define the graph $\mathit{green}(\mathcal{G}, p, m)$ only if

4. p is active through round m − 1 in $\mathcal{G}$.

These restrictions guarantee that if $\mathcal{G}$ is an $\ell$-graph and $\mathit{red}(\mathcal{G}, p, m)$ and
$\mathit{green}(\mathcal{G}, p, m)$ are defined, then $\mathit{red}(\mathcal{G}, p, m)$ and $\mathit{green}(\mathcal{G}, p, m)$ are both
$\ell$-graphs.

In the case of ordinary communication graphs, a result by Moses and Tuttle [MT88]
implies that there is a "similarity chain" of graphs between $\mathcal{G}$ and $\mathit{red}(\mathcal{G}, p, m)$ and
between $\mathcal{G}$ and $\mathit{green}(\mathcal{G}, p, m)$. In their proof (a refinement of similar proofs by Dwork
and Moses [DM90] and others), the sequence of graphs they construct has the property
that each graph in the chain can be obtained from the preceding graph by applying a
sequence of the add, delete, and change graph operations defined above. The same
proof works for augmented communication graphs, provided we insert move operations
between the add, delete, and change operations to move tokens between nodes
appropriately. With this modification, we can prove the following. Let $\mathit{faulty}(\mathcal{G})$ be
the set of processors that fail in $\mathcal{G}$.

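Lemma 2: For every processor p, round m, and set $\pi$ of processors, there are
sequences $\mathit{silence}_\pi(p, m)$ and $\mathit{revive}_\pi(p, m)$ such that for all graphs $\mathcal{G}$:

1. If $\mathit{red}(\mathcal{G}, p, m)$ is defined and $\pi = \mathit{faulty}(\mathcal{G})$, then

\[ \mathit{silence}_\pi(p, m)(\mathcal{G}) = \mathit{red}(\mathcal{G}, p, m). \]

2. If $\mathit{green}(\mathcal{G}, p, m)$ is defined and $\pi = \mathit{faulty}(\mathcal{G})$, then

\[ \mathit{revive}_\pi(p, m)(\mathcal{G}) = \mathit{green}(\mathcal{G}, p, m). \]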

Proof: We proceed by reverse induction on m. Suppose m = r. Define

\[
\begin{aligned}
\mathit{silence}_\pi(p, r) &= \mathit{delete}(r, p, p_1) \cdots \mathit{delete}(r, p, p_n) \\
\mathit{revive}_\pi(p, r) &= \mathit{add}(r, p, p_1) \cdots \mathit{add}(r, p, p_n).
\end{aligned}
\]

For part 1, let $\mathcal{G}$ be any graph and suppose $\mathit{red}(\mathcal{G}, p, r)$ is defined. For each i with
$0 \le i \le n$, let $\mathcal{G}_i$ be the graph identical to $\mathcal{G}$ except that the round r edges from p
to $p_1, \ldots, p_i$ are red. Since $\mathit{red}(\mathcal{G}, p, r)$ is defined, condition 1 implies that p is
covered in round r in $\mathcal{G}$. For each i with $1 \le i \le n$, it follows that $\mathcal{G}_{i-1}$ is really a
graph, and $\mathit{delete}(r, p, p_i)$ can be applied to $\mathcal{G}_{i-1}$ and transforms it to $\mathcal{G}_i$. Since
$\mathcal{G} = \mathcal{G}_0$ and $\mathcal{G}_n = \mathit{red}(\mathcal{G}, p, r)$, it follows that $\mathit{silence}_\pi(p, r)$ transforms $\mathcal{G}$ to
$\mathit{red}(\mathcal{G}, p, r)$. For part 2, let $\mathcal{G}$ be any graph and suppose $\mathit{green}(\mathcal{G}, p, r)$ is defined.
The proof of this part is the direct analogue of the proof of part 1. The only difference
is that since we are coloring round r edges from p green instead of red, we must verify
that p is active through round r − 1 in $\mathcal{G}$, but this follows immediately from condition 4.

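Suppose m < r and the induction hypothesis holds for m + 1. Define $\pi' = \pi \cup \{p\}$
and define

\[
\begin{aligned}
\mathit{set}(m+1, p_i) &= \mathit{move}(m+1, p_1, p_2) \cdots \mathit{move}(m+1, p_{i-1}, p_i) \\
\mathit{reset}(m+1, p_i) &= \mathit{move}(m+1, p_i, p_{i-1}) \cdots \mathit{move}(m+1, p_2, p_1).
\end{aligned}
\]

The set function moves the token from $p_1$ to $p_i$ and the reset function moves the token
back from $p_i$ to $p_1$.

Define $\mathit{block}(m, p, p_i)$ to be $\mathit{delete}(m, p, p_i)$ if $p_i \in \pi'$, and otherwise

\[
\mathit{set}(m+1, p_i)\;
\mathit{silence}_{\pi'}(p_i, m+1)\;
\mathit{delete}(m, p, p_i)\;
\mathit{revive}_{\pi' \cup \{p_i\}}(p_i, m+1)\;
\mathit{reset}(m+1, p_i).
\]

Define $\mathit{unblock}(m, p, p_i)$ to be $\mathit{add}(m, p, p_i)$ if $p_i \in \pi'$, and otherwise

\[
\mathit{set}(m+1, p_i)\;
\mathit{silence}_{\pi'}(p_i, m+1)\;
\mathit{add}(m, p, p_i)\;
\mathit{revive}_{\pi' \cup \{p_i\}}(p_i, m+1)\;
\mathit{reset}(m+1, p_i)
\]

and define

\[
\begin{aligned}
\mathit{block}(m, p) &= \mathit{block}(m, p, p_1) \cdots \mathit{block}(m, p, p_n) \\
\mathit{unblock}(m, p) &= \mathit{unblock}(m, p, p_1) \cdots \mathit{unblock}(m, p, p_n).
\end{aligned}
\]

Finally, define

\[
\begin{aligned}
\mathit{silence}_\pi(p, m) &= \mathit{silence}_\pi(p, m+1)\; \mathit{block}(m, p) \\
\mathit{revive}_\pi(p, m) &= \mathit{silence}_\pi(p, m+1)\; \mathit{unblock}(m, p)\; \mathit{revive}_{\pi'}(p, m+1).
\end{aligned}
\]

For part 1, let $\mathcal{G}$ be any graph, and suppose $\mathit{red}(\mathcal{G}, p, m)$ is defined and
$\pi = \mathit{faulty}(\mathcal{G})$. Since $\mathit{red}(\mathcal{G}, p, m)$ is defined, the graph $\mathit{red}(\mathcal{G}, p, m+1)$ is also
defined, and the induction hypothesis for m + 1 states that $\mathit{silence}_\pi(p, m+1)$
transforms $\mathcal{G}$ to $\mathit{red}(\mathcal{G}, p, m+1)$. We now show that $\mathit{block}(m, p)$ transforms
$\mathit{red}(\mathcal{G}, p, m+1)$ to $\mathit{red}(\mathcal{G}, p, m)$, and we will be done. For each i with $0 \le i \le n$, let
$\mathcal{G}_i$ be the graph identical to $\mathcal{G}$ except that p is silent from round m + 1 and the edges
from p to $p_1, \ldots, p_i$ are red in $\mathcal{G}_i$. Since $\mathit{red}(\mathcal{G}, p, m)$ is defined, condition 1 implies
that p is covered in round m in $\mathcal{G}$. For each i with $0 \le i \le n$, it follows that $\mathcal{G}_i$ really
is a graph and that $\pi' = \mathit{faulty}(\mathcal{G}_i)$. Since $\mathit{red}(\mathcal{G}, p, m+1) = \mathcal{G}_0$ and
$\mathcal{G}_n = \mathit{red}(\mathcal{G}, p, m)$, it is enough to show that $\mathit{block}(m, p, p_i)$ transforms $\mathcal{G}_{i-1}$ to
$\mathcal{G}_i$ for each i with $1 \le i \le n$. The proof of this fact depends on whether $p_i \in \pi'$, so
we consider two cases.

Consider the easy case with $p_i \in \pi'$. We know that p is covered in round m in $\mathcal{G}_{i-1}$
since it is covered in $\mathcal{G}$ by condition 1. We know that p is silent from round m + 1
in $\mathcal{G}_{i-1}$ since it is silent in $\mathcal{G}_0 = \mathit{red}(\mathcal{G}, p, m+1)$. We know that $p_i$ is silent from
round m + 1 in $\mathcal{G}_{i-1}$ since $p_i \in \pi'$ implies (assuming that $p_i$ is not just p again) that
$p_i$ fails in $\mathcal{G}$, and hence is silent from round m + 1 in $\mathcal{G}$ by condition 2. This means
that $\mathit{block}(m, p, p_i) = \mathit{delete}(m, p, p_i)$ can be applied to $\mathcal{G}_{i-1}$ to transform it to $\mathcal{G}_i$.

Now consider the difficult case when $p_i \notin \pi'$. Let $\mathcal{H}_{i-1}$ and $\mathcal{H}_i$ be graphs identical
to $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$, except that a single round m + 1 token is on $p_i$ in $\mathcal{H}_{i-1}$ and $\mathcal{H}_i$.
Condition 3 guarantees that all round m + 1 tokens are on $p_1$ in $\mathcal{G}$, and hence in
$\mathcal{G}_{i-1}$ and $\mathcal{G}_i$, so $\mathcal{H}_{i-1}$ and $\mathcal{H}_i$ really are graphs. In addition, $\mathit{set}(m+1, p_i)$
transforms $\mathcal{G}_{i-1}$ to $\mathcal{H}_{i-1}$ and $\mathit{reset}(m+1, p_i)$ transforms $\mathcal{H}_i$ to $\mathcal{G}_i$. Let $\mathcal{I}_{i-1}$
and $\mathcal{I}_i$ be identical to $\mathcal{H}_{i-1}$ and $\mathcal{H}_i$ except that $p_i$ is silent from round m + 1 in
$\mathcal{I}_{i-1}$ and $\mathcal{I}_i$. Processor $p_i$ is covered in round m + 1 in $\mathcal{H}_{i-1}$ and $\mathcal{H}_i$, so
$\mathcal{I}_{i-1}$ and $\mathcal{I}_i$ really are graphs. In fact, $p_i$ does not fail in $\mathcal{G}$ since $p_i \notin \pi'$, so
$p_i$ is active through round m in $\mathcal{I}_{i-1}$ and $\mathcal{I}_i$, so
$\mathcal{I}_{i-1} = \mathit{red}(\mathcal{H}_{i-1}, p_i, m+1)$ and $\mathcal{H}_i = \mathit{green}(\mathcal{I}_i, p_i, m+1)$. The inductive
hypothesis for m + 1 states that $\mathit{silence}_{\pi'}(p_i, m+1)$ transforms $\mathcal{H}_{i-1}$ to $\mathcal{I}_{i-1}$,
and $\mathit{revive}_{\pi' \cup \{p_i\}}(p_i, m+1)$ transforms $\mathcal{I}_i$ to $\mathcal{H}_i$. Finally, notice that the only
difference between $\mathcal{I}_{i-1}$ and $\mathcal{I}_i$ is the color of the round m edge from p to $p_i$. Since
p is covered in round m and p and $p_i$ are silent from round m + 1 in both graphs, we
know that $\mathit{delete}(m, p, p_i)$ transforms $\mathcal{I}_{i-1}$ to $\mathcal{I}_i$. It follows that $\mathit{block}(m, p, p_i)$
transforms $\mathcal{G}_{i-1}$ to $\mathcal{G}_i$, and we are done.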

For part 2, let $\mathcal{G}$ be any graph and suppose $\mathit{green}(\mathcal{G}, p, m)$ is defined and
$\pi = \mathit{faulty}(\mathcal{G})$. Since $\mathit{green}(\mathcal{G}, p, m)$ is defined, let $\mathcal{G}' = \mathit{green}(\mathcal{G}, p, m)$. Now let
$\mathcal{H}$ and $\mathcal{H}'$ be graphs identical to $\mathcal{G}$ and $\mathcal{G}'$ except that p is silent from round
m + 1 in $\mathcal{H}$ and $\mathcal{H}'$. Since $\mathit{green}(\mathcal{G}, p, m)$ is defined, processor p is covered in
round m in $\mathcal{G}$ by condition 1 and hence in $\mathcal{G}'$, so $\mathcal{H}$ and $\mathcal{H}'$ really are graphs. In
addition, since $\mathit{green}(\mathcal{G}, p, m)$ is defined, processor p is active through round m − 1
in $\mathcal{G}$ by condition 4, so processor p is active through round m in $\mathcal{G}'$ and $\mathcal{H}'$. This
means that $\mathit{green}(\mathcal{H}', p, m+1)$ is defined, and in fact we have
$\mathcal{H} = \mathit{red}(\mathcal{G}, p, m+1)$ and $\mathcal{G}' = \mathit{green}(\mathcal{H}', p, m+1)$. The induction hypothesis for
m + 1 states that $\mathit{silence}_\pi(p, m+1)$ transforms $\mathcal{G}$ to $\mathcal{H}$ and that
$\mathit{revive}_{\pi'}(p, m+1)$ transforms $\mathcal{H}'$ to $\mathcal{G}'$. To complete the proof, we need only show
that $\mathit{unblock}(m, p)$ transforms $\mathcal{H}$ to $\mathcal{H}'$. The proof of this fact is the direct analogue
of the proof in part 1 that $\mathit{block}(m, p)$ transforms $\mathit{red}(\mathcal{G}, p, m+1)$ to $\mathit{red}(\mathcal{G}, p, m)$.
The only difference is that since we are coloring round m edges from p green instead
of red, we must verify that p is active through round m − 1 in the graphs $\mathcal{H}_i$ analogous
to $\mathcal{G}_i$ in the proof of part 1, but this follows immediately from condition 4. □

Given a graph $\mathcal{G}$, let $\mathcal{G}_i[v]$ be a graph identical to $\mathcal{G}$, except that processor $p_i$
has input v. Using the preceding result, we can transform $\mathcal{G}$ to $\mathcal{G}_i[v]$.

Lemma 3: For each i, there is a parameterized sequence $\sigma_i[v]$ with the property that
for all values v and failure-free graphs $\mathcal{G}$, the sequence $\sigma_i[v]$ transforms $\mathcal{G}$ to $\mathcal{G}_i[v]$.

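Proof: Define

\[
\begin{aligned}
\mathit{set}(1, p_i) &= \mathit{move}(1, p_1, p_2) \cdots \mathit{move}(1, p_{i-1}, p_i) \\
\mathit{reset}(1, p_i) &= \mathit{move}(1, p_i, p_{i-1}) \cdots \mathit{move}(1, p_2, p_1)
\end{aligned}
\]

and define

\[
\sigma_i[v] = \mathit{set}(1, p_i)\; \mathit{silence}_{\emptyset}(p_i, 1)\; \mathit{change}(p_i, v)\;
\mathit{revive}_{\{p_i\}}(p_i, 1)\; \mathit{reset}(1, p_i)
\]

where $\emptyset$ denotes the empty set. Now consider any value v and any failure-free
graph $\mathcal{G}$, and let $\mathcal{G}' = \mathcal{G}_i[v]$. Since $\mathcal{G}$ and $\mathcal{G}'$ are failure-free graphs, all round 1
tokens are on $p_1$, so let $\mathcal{H}$ and $\mathcal{H}'$ be graphs identical to $\mathcal{G}$ and $\mathcal{G}'$ except that a
single round 1 token is on $p_i$ in $\mathcal{H}$ and $\mathcal{H}'$. We know that $\mathcal{H}$ and $\mathcal{H}'$ are graphs, and
that $\mathit{set}(1, p_i)$ transforms $\mathcal{G}$ to $\mathcal{H}$ and $\mathit{reset}(1, p_i)$ transforms $\mathcal{H}'$ to $\mathcal{G}'$. Since $p_i$
is covered in $\mathcal{H}$ and $\mathcal{H}'$, let $\mathcal{I}$ and $\mathcal{I}'$ be identical to $\mathcal{H}$ and $\mathcal{H}'$ except that $p_i$ is
silent from round 1. We know that $\mathcal{I}$ and $\mathcal{I}'$ are graphs, and it follows by Lemma 2
that $\mathit{silence}_{\emptyset}(p_i, 1)$ transforms $\mathcal{H}$ to $\mathcal{I}$ and that $\mathit{revive}_{\{p_i\}}(p_i, 1)$ transforms
$\mathcal{I}'$ to $\mathcal{H}'$. Finally, notice that $\mathcal{I}$ and $\mathcal{I}'$ differ only in the input value for $p_i$. Since
$p_i$ is covered and silent from round 1 in both graphs, the operation $\mathit{change}(p_i, v)$ can
be applied to $\mathcal{I}$ and transforms it to $\mathcal{I}'$. Stringing all of this together, it follows that
$\sigma_i[v]$ transforms $\mathcal{G}$ to $\mathcal{G}' = \mathcal{G}_i[v]$. □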
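
For a single round (r = 1) the sequence in the proof of Lemma 3 can be written out in
full, because the base case of Lemma 2 gives silence and revive as n delete and n add
operations. The Python sketch below builds that list symbolically. The tuple encoding
of operations and the function names are assumptions made for illustration, and
processors are named by their indices 1, ..., n.

```python
def set_moves(rnd, i):
    """move(rnd, p_1, p_2) ... move(rnd, p_{i-1}, p_i): carry the token from p_1 to p_i."""
    return [("move", rnd, j, j + 1) for j in range(1, i)]

def reset_moves(rnd, i):
    """move(rnd, p_i, p_{i-1}) ... move(rnd, p_2, p_1): carry the token back to p_1."""
    return [("move", rnd, j, j - 1) for j in range(i, 1, -1)]

def sigma_i_one_round(i, v, n):
    """sigma_i[v] when r = 1, using the base case of Lemma 2 for silence and revive."""
    silence = [("delete", 1, i, q) for q in range(1, n + 1)]
    revive = [("add", 1, i, q) for q in range(1, n + 1)]
    return (set_moves(1, i) + silence + [("change", i, v)]
            + revive + reset_moves(1, i))
```

In this one-round setting the list has 2(i − 1) + 2n + 1 entries: i − 1 moves out, n
deletes, one change, n adds, and i − 1 moves back.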
By concatenating such operation sequences, we can transform $\mathcal{G}$ into $\mathcal{G}[v]$ by
changing processors' input values one at a time:

Lemma 4: Let $\sigma[v] = \sigma_1[v] \cdots \sigma_n[v]$. For every value v and failure-free graph $\mathcal{G}$,
the sequence $\sigma[v]$ transforms $\mathcal{G}$ to $\mathcal{G}[v]$.

Now we can define the parameter N used in defining the shape of B: N is the length
of the sequence $\sigma[v]$, which is exponential in r.
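
To get a feel for this growth, one can unroll the recursive definitions of silence and
revive from Lemma 2. The Python sketch below computes an upper bound on the length of
$\sigma[v]$ under the simplifying assumption that every block and unblock takes its longer
branch (the one used when $p_i \notin \pi'$), so the true length can be smaller. The function
names are hypothetical and the recurrence is our reading of the definitions, not a
formula stated in the paper.

```python
def silence_revive_bounds(n, r):
    """Upper bounds (S, R) on the lengths of silence(p, 1) and revive(p, 1)."""
    S = R = n  # base case m = r: n delete operations, respectively n add operations
    for _m in range(r - 1, 0, -1):
        # block(m, p): for each i, at most 2(i - 1) moves + silence + one op + revive.
        block = n * (n - 1) + n * (S + 1 + R)
        # silence(p, m) = silence(p, m + 1) block(m, p)
        # revive(p, m)  = silence(p, m + 1) unblock(m, p) revive(p, m + 1)
        S, R = S + block, S + block + R
    return S, R

def sigma_length_bound(n, r):
    """Upper bound on the length of sigma[v] = sigma_1[v] ... sigma_n[v]."""
    S, R = silence_revive_bounds(n, r)
    return sum(2 * (i - 1) + S + 1 + R for i in range(1, n + 1))
```

For r = 1 the bound is exact and equals 3n^2. Each additional round multiplies the
bound by a factor that grows with n, which is consistent with a length exponential
in r.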

6.4 Graph merge 

Speaking informally, we will use each sequence $\sigma[v_i]$ of graph operations to generate
a sequence of graphs, and we will use this sequence of graphs to label vertices along
the edge of B in the ith dimension. Then we will label vertices in the interior of B by
performing a "merge" of the graphs on the edges in the different dimensions.

The merge of a sequence $\mathcal{H}_1, \ldots, \mathcal{H}_k$ of graphs is a graph defined as follows:

1. an edge e is colored red if it is red in any of the graphs $\mathcal{H}_1, \ldots, \mathcal{H}_k$, and green
otherwise, and

2. an initial node (p, 0) is labeled with the value $v_i$ where i is the maximum index
such that (p, 0) is labeled with $v_i$ in $\mathcal{H}_i$, or $v_0$ if no such i exists, and

3. the number of tokens on a node (p, i) is the sum of the number of tokens on the
node in the graphs $\mathcal{H}_1, \ldots, \mathcal{H}_k$.

The first condition says that a message is missing in the resulting graph if and only
if it is missing in any of the merged graphs. To understand the second condition,
notice that for each processor $p_j$ there is an integer $s_j$ with the property that $p_j$'s input
value is changed to $v_i$ by the $s_j$th operation appearing in $\sigma[v_i]$. Now choose a vertex
$x = (x_1, \ldots, x_k)$ of B, and imagine walking from the origin to x by walking along the
first dimension to $(x_1, 0, \ldots, 0)$, then along the second dimension to $(x_1, x_2, 0, \ldots, 0)$,
and so forth. In each dimension i, processor $p_j$'s input is changed from $v_{i-1}$ to $v_i$
after $s_j$ steps in this dimension. Since $x_1 \ge x_2 \ge \cdots \ge x_k$, there is a final dimension i
in which $p_j$'s input is changed to $v_i$, and never changed again. The second condition
above is just a compact way of identifying this final value $v_i$.
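
The three merge rules translate directly into code. The Python sketch below is a
minimal illustration on a toy representation: each graph is a dictionary with a set of
red edges, a map from processors to input values, and a token counter. This
representation, the function name, and the assumption that the values $v_0, \ldots, v_k$
are pairwise distinct are ours.

```python
from collections import Counter

def merge(graphs, values):
    """Merge graphs H_1, ..., H_k (toy representation, see the lead-in).

    graphs -- list of k dicts with keys "red" (set of edges),
              "inputs" (processor -> input value) and "tokens" (Counter)
    values -- list [v_0, v_1, ..., v_k]
    """
    k = len(graphs)
    red = set()
    tokens = Counter()
    for h in graphs:
        red |= h["red"]             # rule 1: red if red in any merged graph
        tokens.update(h["tokens"])  # rule 3: token counts add up
    inputs = {}
    for p in graphs[0]["inputs"]:
        inputs[p] = values[0]       # rule 2: default to v_0
        for i in range(1, k + 1):
            if graphs[i - 1]["inputs"][p] == values[i]:
                inputs[p] = values[i]  # later matches overwrite: maximum index wins
    return {"red": red, "inputs": inputs, "tokens": tokens}
```

With k = 2, if $\mathcal{H}_1$ has changed only $p_1$ and $p_2$ to $v_1$ and $\mathcal{H}_2$ has changed only $p_1$
to $v_2$, the merged inputs are $v_2$, $v_1$, $v_0$ for $p_1$, $p_2$, $p_3$ respectively.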

Lemma 5: Let $\mathcal{H}$ be the merge of the graphs $\mathcal{H}_1, \ldots, \mathcal{H}_k$. If $\mathcal{H}_1, \ldots, \mathcal{H}_k$ are
1-graphs, then $\mathcal{H}$ is a k-graph.

Proof: We consider the three conditions required of a k-graph in turn. First, there are k
tokens in each round of $\mathcal{H}$ since there is 1 token in each round of each graph
$\mathcal{H}_1, \ldots, \mathcal{H}_k$. Second, every red edge in $\mathcal{H}$ is covered by a token since every red edge
in $\mathcal{H}$ corresponds to a red edge in one of the graphs $\mathcal{H}_j$, and this edge is covered by a
token in $\mathcal{H}_j$. Third, if there is a red edge from p in round i in $\mathcal{H}$, then there is a red
edge from p in round i of one of the graphs $\mathcal{H}_j$. In this graph, p is silent from round
i + 1, so the same is true in $\mathcal{H}$. Thus, $\mathcal{H}$ is a k-graph. □

6.5 Graph assignments

Now we can define the assignment of graphs to vertices of B. For each value $v_i$, let
$\mathcal{F}_i$ be the failure-free 1-graph in which all processors have input $v_i$. Let
$x = (x_1, \ldots, x_k)$ be an arbitrary vertex of B. For each coordinate $x_j$, let $\sigma_j$ be the
prefix of $\sigma[v_j]$ consisting of the first $x_j$ operations, and let $\mathcal{H}_j$ be the 1-graph
resulting from the application of $\sigma_j$ to $\mathcal{F}_{j-1}$. This means that in $\mathcal{H}_j$, some set
$p_1, \ldots, p_i$ of adjacent processors have had their inputs changed from $v_{j-1}$ to $v_j$. The
graph $\mathcal{G}$ labeling x is defined to be the merge of $\mathcal{H}_1, \ldots, \mathcal{H}_k$. We know that $\mathcal{G}$ is a
k-graph by Lemma 5, and hence that at most $rk \le f$ processors fail in $\mathcal{G}$.

Remember that we always write the vertices of a primitive simplex in a canonical
order $y_0, \ldots, y_k$. In the same way, we always write the graphs labeling the vertices of
a primitive simplex in the canonical order $\mathcal{G}_0, \ldots, \mathcal{G}_k$, where $\mathcal{G}_i$ is the graph
labeling $y_i$.

6.6 Graphs on a simplex 

The graphs labeling the vertices of a primitive simplex have some convenient
properties. For this section, fix a primitive simplex S, and let $y_0, \ldots, y_k$ be the vertices
of S and let $\mathcal{G}_0, \ldots, \mathcal{G}_k$ be the graphs labeling the corresponding vertices of S. Our
first result says that any processor that is uncovered at a vertex of S is nonfaulty at all
vertices of S.

Lemma 6: If processor q is not covered in the graph labeling a vertex of S, then q is
nonfaulty in the graph labeling every vertex of S.

Proof: Let $y_0 = (a_1, \ldots, a_k)$ be the first vertex of S. For each i, let $\sigma_i$ and $\sigma_i\tau_i$ be
the prefixes of $\sigma[v_i]$ consisting of the first $a_i$ and $a_i + 1$ operations, and let $\mathcal{H}_i$ and
$\mathcal{H}'_i$ be the result of applying $\sigma_i$ and $\sigma_i\tau_i$ to $\mathcal{F}_{i-1}$. For each i, we know that the
graph $\mathcal{G}_i$ labeling the vertex $y_i$ of S is the merge of graphs $\mathcal{I}^i_1, \ldots, \mathcal{I}^i_k$ where
$\mathcal{I}^i_j$ is either $\mathcal{H}_j$ or $\mathcal{H}'_j$. Suppose q is faulty in $\mathcal{G}_i$. Then q must be faulty in some
graph $\mathcal{I}^i_j$ in the sequence of graphs $\mathcal{I}^i_1, \ldots, \mathcal{I}^i_k$ merged to form $\mathcal{G}_i$, so q must
fail in one of the graphs $\mathcal{H}_j$ or $\mathcal{H}'_j$. Since $\sigma_j$ and $\sigma_j\tau_j$ are prefixes of $\sigma[v_j]$, it is
easy to see from the definition of $\sigma[v_j]$ that the fact that q fails in one of the graphs
$\mathcal{H}_j$ and $\mathcal{H}'_j$ implies that q is covered in both graphs. Since one of these graphs is
contained in the sequence of graphs merged to form $\mathcal{G}_a$ for each a, it follows that q is
covered in each $\mathcal{G}_a$. This contradicts the fact that q is uncovered in a graph labeling a
vertex of S. □

Our next result shows that we can use the bound on the number of tokens to bound
the number of processors failing at any vertex of S.

Lemma 7: If $F_i$ is the set of processors failing in $\mathcal{G}_i$ and $F = \cup_i F_i$, then
$|F| \le rk \le f$.

Proof: If $q \in F$, then $q \in F_i$ for some i and q fails in $\mathcal{G}_i$, so q is covered in every graph
labeling every vertex of S by Lemma 6. It follows that each processor in F is covered
in each graph labeling S. Since there are at most rk tokens to cover processors in any
graph, there are at most rk processors in F. □

We have assigned graphs to S, and now we must assign processors to S. A local
processor labeling of S is an assignment of distinct processors $q_0, \ldots, q_k$ to the
vertices $y_0, \ldots, y_k$ of S so that $q_i$ is uncovered in $\mathcal{G}_i$ for each $y_i$. A global processor
labeling of B is an assignment of processors to vertices of B that induces a local
processor labeling at each primitive simplex. The final important property of the graphs
labeling S is that if we use a processor labeling to label S with processors, then S is
consistent with a single global communication graph. The proof of this requires a few
preliminary results.

Lemma 8: If $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$ differ in p's input value, then p is silent from round 1
in $\mathcal{G}_0, \ldots, \mathcal{G}_k$. If $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$ differ in the color of an edge from q to p in round t,
then p and q are silent from round t + 1 in $\mathcal{G}_0, \ldots, \mathcal{G}_k$.

Proof: Suppose the two graphs $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$ labeling vertices $y_{i-1}$ and $y_i$ differ in
the input to p at time t = 0 or in the color of an edge from q to p in round t.
The vertices differ in exactly one coordinate j, so $y_{i-1} = (a_1, \ldots, a_j, \ldots, a_k)$ and
$y_i = (a_1, \ldots, a_j + 1, \ldots, a_k)$. For each $\ell$, let $\sigma_\ell$ be the prefix of $\sigma[v_\ell]$ consisting
of the first $a_\ell$ operations, and let $\mathcal{H}^0_\ell$ be the result of applying $\sigma_\ell$ to $\mathcal{F}_{\ell-1}$.
Furthermore, in the special case of $\ell = j$, let $\sigma_j\tau_j$ be the prefix of $\sigma[v_j]$ consisting of
the first $a_j + 1$ operations, and let $\mathcal{H}^1_j$ be the result of applying $\sigma_j\tau_j$ to $\mathcal{F}_{j-1}$.

We know that $\mathcal{G}_{i-1}$ is the merge of $\mathcal{H}^0_1, \ldots, \mathcal{H}^0_j, \ldots, \mathcal{H}^0_k$, and that $\mathcal{G}_i$ is the
merge of $\mathcal{H}^0_1, \ldots, \mathcal{H}^1_j, \ldots, \mathcal{H}^0_k$. If $\mathcal{H}^0_j$ and $\mathcal{H}^1_j$ are equal, then $\mathcal{G}_{i-1}$ and
$\mathcal{G}_i$ are equal. Thus, $\mathcal{H}^0_j$ and $\mathcal{H}^1_j$ must differ in the input to p at time t = 0 or in
the color of an edge between q and p in round t, exactly as $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$ differ. Since
$\mathcal{H}^0_j$ and $\mathcal{H}^1_j$ are the result of applying $\sigma_j$ and $\sigma_j\tau_j$ to $\mathcal{F}_{j-1}$, this change at
time t must be caused by the operation $\tau_j$. It is easy to see from the definition of a
graph operation like $\tau_j$ that (1) if $\tau_j$ changes p's input value, then p is silent from
round 1 in $\mathcal{H}^0_j$ and $\mathcal{H}^1_j$, and (2) if $\tau_j$ changes the color of an edge from q to p in
round t, then p and q are silent from round t + 1 in $\mathcal{H}^0_j$ and $\mathcal{H}^1_j$. Consequently, the
same is true in the merged graphs $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$. □

Lemma 9: If $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$ differ in the local communication graph of p at time t,
then p is silent from round t + 1 in $\mathcal{G}_0, \ldots, \mathcal{G}_k$.

Proof: We proceed by induction on t. If t = 0, then the two graphs must differ in
the input to p at time 0, and Lemma 8 implies that p is silent from round 1 in the
graphs $\mathcal{G}_0, \ldots, \mathcal{G}_k$ labeling the simplex. Suppose t > 0 and the inductive hypothesis
holds for t − 1. Processor p's local communication graph at time t can differ in the
two graphs for one of two reasons: either p hears from some processor q in round t in
one graph and not in the other, or p hears from some processor q in both graphs but q
has different local communication graphs at time t − 1 in the two graphs. In the first
case, Lemma 8 implies that p is silent from round t + 1 in the graphs $\mathcal{G}_0, \ldots, \mathcal{G}_k$. In
the second case, the induction hypothesis for t − 1 implies that q is silent from round t
in the graphs $\mathcal{G}_0, \ldots, \mathcal{G}_k$. In particular, q is silent in round t in $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$,
contradicting the assumption that p hears from q in round t in both graphs, so this case
can't happen. □

Lemma 10: If p sends a message in round r in any of the graphs $\mathcal{G}_0, \ldots, \mathcal{G}_k$, then p
has the same local communication graph at time r − 1 in all of the graphs
$\mathcal{G}_0, \ldots, \mathcal{G}_k$.

Proof: If p has different local communication graphs at time r − 1 in two of the
graphs $\mathcal{G}_0, \ldots, \mathcal{G}_k$, then there are two adjacent graphs $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$ in which p has
different local communication graphs at time r − 1. By Lemma 9, p is silent in round r
in all of the graphs $\mathcal{G}_0, \ldots, \mathcal{G}_k$, contradicting the assumption that p sent a round r
message in one of them. □

Finally, we can prove the crucial property of primitive simplexes in the Bermuda
Triangle:

Lemma 11: Given a local processor labeling, let $q_0, \ldots, q_k$ be the processors labeling
the vertices of S, and let $\mathcal{L}_i$ be the local communication graph of $q_i$ in $\mathcal{G}_i$. There is a
global communication graph $\mathcal{G}$ with the property that each $q_i$ is nonfaulty in $\mathcal{G}$ and has
the local communication graph $\mathcal{L}_i$ in $\mathcal{G}$.

Proof: Let Q be the set of processors that send a round r message in any of the
graphs $\mathcal{G}_0, \ldots, \mathcal{G}_k$. Notice that this set includes the uncovered processors $q_0, \ldots, q_k$,
since Lemma 6 says that these processors are nonfaulty in each of these graphs. For
each processor $q \in Q$, Lemma 10 says that q has the same local communication graph
at time r − 1 in each graph $\mathcal{G}_0, \ldots, \mathcal{G}_k$.

Let $\mathcal{H}$ be the global communication graph underlying any one of these graphs.
Notice that each processor $q \in Q$ is active through round r − 1 in $\mathcal{H}$. To see this, notice
that since q sends a message in round r in one of the graphs labeling S, it sends all
messages in round r − 1 in that graph. On the other hand, if q fails to send a message in
round r − 1 in $\mathcal{H}$, then the same is true for the corresponding graph labeling S. Thus,
there are adjacent graphs $\mathcal{G}_{i-1}$ and $\mathcal{G}_i$ labeling S where q sends a round r − 1 message
in one and not in the other. Consequently, Lemma 8 says q is silent in round r in all
graphs labeling S, but this contradicts the fact that q does send a round r message in
one of these graphs.

Now let $\mathcal{G}$ be the global communication graph obtained from $\mathcal{H}$ by coloring green
each round r edge from each processor $q \in Q$, unless the edge is red in one of the local
communication graphs $\mathcal{L}_0, \ldots, \mathcal{L}_k$, in which case we color it red in $\mathcal{G}$ as well. Notice
that since the processors $q \in Q$ are active through round r − 1 in $\mathcal{H}$, changing the color
of a round r edge from a processor $q \in Q$ to either red or green is acceptable, provided
we do not cause more than f processors to fail in the process. Fortunately, Lemma 7
implies that there are at least $n - rk \ge n - f$ processors that do not fail in any of the
graphs $\mathcal{G}_0, \ldots, \mathcal{G}_k$. This means that there is a set of n − f processors that send to every
processor in round r of every graph $\mathcal{G}_i$, and in particular that the round r edges from
these processors are green in every local communication graph $\mathcal{L}_i$. It follows that for
at least n − f processors, all round r edges from these processors are green in $\mathcal{G}$, so at
most f processors fail in $\mathcal{G}$.

Each processor $q_i$ is nonfaulty in $\mathcal{G}$, since $q_i$ is nonfaulty in each $\mathcal{G}_0, \ldots, \mathcal{G}_k$,
meaning each edge from $q_i$ is green in each $\mathcal{G}_0, \ldots, \mathcal{G}_k$ and $\mathcal{L}_0, \ldots, \mathcal{L}_k$, and
therefore in $\mathcal{G}$. In addition, each processor $q_i$ has the local communication graph $\mathcal{L}_i$
in $\mathcal{G}$. To see this, notice that $\mathcal{L}_i$ consists of a round r edge from $p_j$ to $q_i$ for each j, and
the local communication graph for $p_j$ at time r − 1 if this edge is green. This edge is
green in $\mathcal{L}_i$ if and only if it is green in $\mathcal{G}$. In addition, if this edge is green in $\mathcal{L}_i$ then it
is green in $\mathcal{G}_i$. In this case, Lemma 10 says that $p_j$ has the same local communication
graph at time r − 1 in each graph $\mathcal{G}_0, \ldots, \mathcal{G}_k$, and therefore in $\mathcal{G}$. Consequently, $q_i$ has
the local communication graph $\mathcal{L}_i$ in $\mathcal{G}$. □



7 Processor Assignment 

What Lemma 11 at the end of the preceding section tells us is that all we have left to 
do is to construct a global processor labeling. In this section, we show how to do this. 
We first associate a set of "live" processors with each communication graph labeling a 
vertex of B, and then we choose one processor from each set to label vertices of B. 

7.1 Live processors 

Given a graph $\mathcal{G}$, we construct a set of $c = n - rk \ge k + 1$ uncovered (and hence
nonfaulty) processors. We refer to these processors as the live processors in $\mathcal{G}$, and we
denote this set by $\mathit{live}(\mathcal{G})$. These live sets have one crucial property: if $\mathcal{G}$ and $\mathcal{G}'$ are
two graphs labeling adjacent vertices, and if p is in both $\mathit{live}(\mathcal{G})$ and $\mathit{live}(\mathcal{G}')$, then p
has the same rank in both sets. As usual, we define the rank of $p_i$ in a set R of
processors to be the number of processors $p_j \in R$ with $j \le i$.

Given a graph $\mathcal{G}$, we now show how to construct $\mathit{live}(\mathcal{G})$. This construction has one
goal: if $\mathcal{G}$ and $\mathcal{G}'$ are graphs labeling adjacent vertices, then the construction should
minimize the number of processors whose rank differs in the sets $\mathit{live}(\mathcal{G})$ and
$\mathit{live}(\mathcal{G}')$. The construction of $\mathit{live}(\mathcal{G})$ begins with the set of all processors, and
removes a set of rk processors, one for each token. This set of removed processors
includes the covered processors, but may include other processors as well. For example,
suppose $p_i$ and $p_{i+1}$ are covered with one token each in $\mathcal{G}$, but suppose $p_i$ is
uncovered and $p_{i+1}$ is covered by two tokens in $\mathcal{G}'$. For simplicity, let's assume these
are the only tokens on the graphs. When constructing the set $\mathit{live}(\mathcal{G})$, we remove both
$p_i$ and $p_{i+1}$ since they are both covered. When constructing the set $\mathit{live}(\mathcal{G}')$, we
remove $p_{i+1}$, but we must also remove a second processor corresponding to the second
token covering $p_{i+1}$. Which processor should we remove? If we choose a low processor
like $p_1$, then we have changed the rank of a low processor like $p_2$ from 2 to 1. If we
choose a high processor like $p_n$, then we have changed the rank of a high processor
like $p_{n-1}$ from n − 3 to n − 2. On the other hand, if we choose to remove $p_i$ again,
then no processors change rank. In general, the construction of $\mathit{live}(\mathcal{G})$ considers each
processor p in turn. If p is covered by $m_p$ tokens in $\mathcal{G}$, then the construction removes
$m_p$ processors by starting with p, working down the list of remaining processors
smaller than p, and then working up the list of processors larger than p if necessary.

Specifically, given a graph $\mathcal{G}$, the multiplicity of p is the number $m_p$ of tokens
appearing on nodes for p in $\mathcal{G}$, and the multiplicity of $\mathcal{G}$ is the vector
$m = (m_{p_1}, \ldots, m_{p_n})$. Given the multiplicity of $\mathcal{G}$ as input, the algorithm given in
Figure 12 computes $\mathit{live}(\mathcal{G})$. In this algorithm, processor $p_i$ is denoted by its index i.
We refer to the ith iteration of the main loop as the ith step of the construction. This
construction has two obvious properties:



    S ← {1, ..., n}
    for each i = 1, ..., n
        count ← 0
        for each j = i, i − 1, ..., 1, i + 1, ..., n
            if count = m_i then break
            if j ∈ S then
                S ← S − {j}
                count ← count + 1
    live(G) ← S

Figure 12: The construction of $\mathit{live}(\mathcal{G})$.



Lemma 12: If $i \in \mathit{live}(\mathcal{G})$ then

1. i is uncovered: $m_i = 0$

2. room exists under i: $\sum_{j=1}^{i-1} m_j \le i - 1$

Proof: Suppose $i \in \mathit{live}(\mathcal{G})$. For part 1, if $m_i > 0$ then i will be removed by step i if it
has not already been removed by an earlier step, contradicting $i \in \mathit{live}(\mathcal{G})$. For part 2,
notice that steps 1 through i − 1 remove a total of $\sum_{j=1}^{i-1} m_j$ values. If this sum is
greater than i − 1, then it is not possible for all of these values to be contained in
1, ..., i − 1, so i will be removed within the first i − 1 steps, contradicting
$i \in \mathit{live}(\mathcal{G})$. □
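
The construction in Figure 12 is short enough to run directly. The Python sketch below
is a straightforward transcription, with processors denoted by their indices 1, ..., n
and the multiplicity vector passed as a list. The function name and the list-based
interface are choices made for this illustration.

```python
def live(m):
    """Compute the live set from the multiplicity vector m = [m_1, ..., m_n].

    Follows Figure 12: step i removes m_i processors, starting with i itself,
    then working down through smaller indices, then up through larger ones.
    """
    n = len(m)
    S = set(range(1, n + 1))
    for i in range(1, n + 1):
        count = 0
        # Visit j = i, i - 1, ..., 1, i + 1, ..., n in that order.
        for j in list(range(i, 0, -1)) + list(range(i + 1, n + 1)):
            if count == m[i - 1]:
                break
            if j in S:
                S.remove(j)
                count += 1
    return sorted(S)
```

On the example from the text with n = 5 and i = 2, the multiplicities [0, 1, 1, 0, 0]
(one token each on $p_2$ and $p_3$) and [0, 0, 2, 0, 0] (two tokens on $p_3$) produce the same
live set, so no processor changes rank between the two graphs.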

The assignment of graphs to the corners of a simplex has the property that once p
becomes covered on one corner of S, it remains covered on the following corners of S:

Lemma 13: If p is uncovered in the graphs $\mathcal{G}_i$ and $\mathcal{G}_j$, where i < j, then p is
uncovered in each graph $\mathcal{G}_i, \mathcal{G}_{i+1}, \ldots, \mathcal{G}_j$.

Proof: If p is covered in $\mathcal{G}_\ell$ for some $\ell$ between i and j, then p is uncovered in
$\mathcal{G}_{\ell-1}$ and covered in $\mathcal{G}_\ell$ for some $\ell$ between i and j. Since $\mathcal{G}_{\ell-1}$ and $\mathcal{G}_\ell$ are on
adjacent vertices of the simplex, the sequences of graphs merged to construct them are
of the form $\mathcal{H}_1, \ldots, \mathcal{H}_m, \ldots, \mathcal{H}_k$ and $\mathcal{H}_1, \ldots, \mathcal{H}'_m, \ldots, \mathcal{H}_k$, respectively, for
some m. Since p is uncovered in $\mathcal{G}_{\ell-1}$ and covered in $\mathcal{G}_\ell$, it must be that p is
uncovered in $\mathcal{H}_m$ and covered in $\mathcal{H}'_m$. Notice, however, that $\mathcal{H}'_m$ is used in the
construction of each graph $\mathcal{G}_\ell, \mathcal{G}_{\ell+1}, \ldots, \mathcal{G}_j$. This means that p is covered in each of
these graphs, contradicting the fact that p is uncovered in $\mathcal{G}_j$. □

Finally, because token placements in adjacent graphs on a simplex differ in at most 
the movement of one token from one processor to an adjacent processor, we can use 
the preceding lemma to prove the following: 

Lemma 14: If $p \in \mathit{live}(\mathcal{G}_i)$ and $p \in \mathit{live}(\mathcal{G}_j)$, then p has the same rank in
$\mathit{live}(\mathcal{G}_i)$ and $\mathit{live}(\mathcal{G}_j)$.

Proof: Assume without loss of generality that i < j. Since $p \in \mathit{live}(\mathcal{G}_i)$ and
$p \in \mathit{live}(\mathcal{G}_j)$, Lemma 12 implies that p is uncovered in the graphs $\mathcal{G}_i$ and $\mathcal{G}_j$, and
Lemma 13 implies that p is uncovered in each graph $\mathcal{G}_i, \mathcal{G}_{i+1}, \ldots, \mathcal{G}_j$. Since token
placements in adjacent graphs differ in at most the movement of one token from one
processor to an adjacent processor, and since p is uncovered in all of these graphs, this
means that the number of tokens on processors smaller than p is the same in all of these
graphs. Specifically, the sum $\sum_{\ell=1}^{p-1} m_\ell$ of multiplicities of processors smaller than p
is the same in $\mathcal{G}_i, \mathcal{G}_{i+1}, \ldots, \mathcal{G}_j$. In particular, Lemma 12 implies that this sum is the
same value $s \le p - 1$ in $\mathcal{G}_i$ and $\mathcal{G}_j$, so p has the same rank p − s in $\mathit{live}(\mathcal{G}_i)$ and
$\mathit{live}(\mathcal{G}_j)$. □

7.2 Processor labeling 

We now choose one processor from each set $\mathit{live}(\mathcal{G})$ to label the vertex with graph $\mathcal{G}$.
Given a vertex $x = (x_1, \ldots, x_k)$, we define

\[ \mathit{plane}(x) = \sum_{i=1}^{k} x_i \pmod{k+1}. \]

Lemma 15: If x and y are distinct vertices of the same simplex, then
$\mathit{plane}(x) \ne \mathit{plane}(y)$.

Proof: Since x and y are in the same simplex, we can write $y = x + f_1 + \cdots + f_j$
for some distinct unit vectors $f_1, \ldots, f_j$ and some $1 \le j \le k$. If $x = (x_1, \ldots, x_k)$
and $y = (y_1, \ldots, y_k)$, then the sums $\sum_i x_i$ and $\sum_i y_i$ differ by exactly j. Since
$1 \le j \le k$ and since planes are defined as sums modulo k + 1, we have
$\mathit{plane}(x) \ne \mathit{plane}(y)$. □
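
Lemma 15 is easy to check numerically. The Python sketch below computes plane(x) and
enumerates the vertices of a simplex under the assumption, taken from the proof above,
that they have the form $y_0, y_0 + f_{d_1}, y_0 + f_{d_1} + f_{d_2}, \ldots$ for distinct unit vectors.
The helper simplex_vertices and its arguments are hypothetical names introduced for
this example.

```python
def plane(x, k):
    """Sum of the coordinates of x, taken modulo k + 1."""
    return sum(x) % (k + 1)

def simplex_vertices(y0, order):
    """Vertices y_0, ..., y_k obtained by adding distinct unit vectors one at a time.

    y0    -- the first vertex, as a tuple of k coordinates
    order -- a permutation of range(k): the dimension incremented at each step
    """
    verts = [tuple(y0)]
    for d in order:
        nxt = list(verts[-1])
        nxt[d] += 1
        verts.append(tuple(nxt))
    return verts
```

For k = 3 and $y_0 = (5, 3, 1)$, the coordinate sums of the four vertices are 9, 10, 11,
and 12 whatever the order of the unit vectors, so the planes are 1, 2, 3, and 0, all
distinct.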

We define a global processor labeling tt as follows: given a vertex x labeled with a 
graph G, we define tt to map x to the processor having rank plane(x) in live{G). 

Lemma 16: The mapping tt is a global processor labeling. 

Proof: First, it is clear that $\pi$ maps each vertex $x$ labeled with a graph $\mathcal{G}_x$ to a
processor $q_x$ that is uncovered in $\mathcal{G}_x$. Second, $\pi$ maps distinct vertices of a simplex
to distinct processors. To see this, suppose that distinct vertices $x$ and $y$ of a simplex are
both labeled with $p$, and let $\mathcal{G}_x$ and $\mathcal{G}_y$ be the graphs labeling $x$ and $y$. We know
that the rank of $p$ in $live(\mathcal{G}_x)$ is $plane(x)$ and that the rank of $p$ in $live(\mathcal{G}_y)$ is
$plane(y)$, and we know that $p$ has the same rank in $live(\mathcal{G}_x)$ and $live(\mathcal{G}_y)$ by
Lemma 14. Consequently, $plane(x) = plane(y)$, contradicting Lemma 15. □

We label the vertices of $B$ with processors according to the processor labeling $\pi$.

8 Ordered Pair Assignment 

Finally, we assign ordered pairs $(p, \mathcal{L})$ of processor ids and local communication graphs
to vertices of $B$. Given a vertex $x$ labeled with processor $p$ and graph $\mathcal{G}$, we label $x$
with the ordered pair $(p, \mathcal{L})$ where $\mathcal{L}$ is the local communication graph of $p$ in
$\mathcal{G}$. The following result is a direct consequence of Lemmas 11 and 16. It says that the local
communication graphs of the processors labeling the corners of a simplex are consistent with
a single global communication graph.

Lemma 17: Let $q_0, \ldots, q_k$ and $\mathcal{L}_0, \ldots, \mathcal{L}_k$ be the processors and local
communication graphs labeling the vertices of a simplex. There is a global communication
graph $\mathcal{G}$ with the property that each $q_i$ is nonfaulty in $\mathcal{G}$ and has the local
communication graph $\mathcal{L}_i$ in $\mathcal{G}$.

9 Sperner's Lemma 

We now state Sperner's Lemma, and use it to prove a lower bound on the number of
rounds required to solve $k$-set agreement.

Notice that the corners of $B$ are points of the form $c_i = (N, \ldots, N, 0, \ldots, 0)$ with $i$
indices of value $N$ for $0 \le i \le k$. For example, $c_0 = (0, \ldots, 0)$, $c_1 = (N, 0, \ldots, 0)$,
and $c_k = (N, \ldots, N)$. Informally, a Sperner coloring of $B$ assigns a color to each
vertex so that each corner vertex $c_i$ is given a distinct color $w_i$, each vertex on the edge
between $c_i$ and $c_j$ is given either $w_i$ or $w_j$, and so on.

More formally, let $S$ be a simplex and let $F$ be a face of $S$. Any triangulation
of S induces a triangulation of F in the obvious way. Let T be a triangulation of S. 
A Sperner coloring of T assigns a color to each vertex of T so that each corner of T 
has a distinct color, and so that the vertices contained in a face F are colored with the 
colors on the corners of F, for each face F of T. Sperner colorings have a remarkable 
property: at least one simplex in the triangulation must be given all possible colors. 

Lemma 18 (Sperner's Lemma): If $B$ is a triangulation of a $k$-simplex, then for any
Sperner coloring of $B$, there exists at least one $k$-simplex in $B$ whose vertices are all
given distinct colors.
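To make the statement concrete, here is a small sketch for the case k = 2. It assumes a Kuhn-style triangulation of the triangle with corners (0, 0), (N, 0), and (N, N) into unit triangles, which is one natural choice and is not claimed to be the exact construction of the Bermuda Triangle. It draws random colorings that respect the Sperner conditions and counts the triangles that receive all three colors. All function names are illustrative.

```python
import random

def vertices(N):
    """Grid points (x1, x2) with N >= x1 >= x2 >= 0."""
    return [(a, b) for a in range(N + 1) for b in range(a + 1)]

def triangles(N):
    """Unit triangles x, x + e_a, x + e_a + e_b that stay inside the region."""
    inside = set(vertices(N))
    out = []
    for (a, b) in vertices(N):
        for tri in (((a, b), (a + 1, b), (a + 1, b + 1)),
                    ((a, b), (a, b + 1), (a + 1, b + 1))):
            if all(v in inside for v in tri):
                out.append(tri)
    return out

def allowed_colors(x, N):
    """Colors i whose corner c_i has nonzero barycentric weight at x."""
    x1, x2 = x
    weights = (N - x1, x1 - x2, x2)  # N times the weights of c_0, c_1, c_2
    return [i for i, w in enumerate(weights) if w > 0]

def random_sperner_coloring(N, rng):
    """Pick, independently for each vertex, one of its allowed colors."""
    return {x: rng.choice(allowed_colors(x, N)) for x in vertices(N)}

def count_rainbow(N, coloring):
    """Number of triangles whose three vertices carry three distinct colors."""
    return sum(1 for tri in triangles(N)
               if {coloring[v] for v in tri} == {0, 1, 2})
```

Calling `count_rainbow(5, random_sperner_coloring(5, random.Random(0)))` always yields at least one such triangle, whatever the seed, as Lemma 18 requires.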

Let $P$ be the protocol whose existence we assumed in the previous section. Define
a coloring $\chi_P$ of $B$ as follows. Given a vertex $x$ labeled with processor $p$ and local
communication graph $\mathcal{L}$, color $x$ with the value $v$ that $P$ requires processor $p$ to choose
when its local communication graph is $\mathcal{L}$. This coloring is clearly well-defined, since $P$
is a protocol in which all processors choose an output value at the end of round $r$. We
will now expand the argument sketched in the introduction to show that $\chi_P$ is a Sperner
coloring.

We first prove a simple claim. Recall that $\mathcal{B}$ is the simplex whose vertices are the
corner vertices $c_0, \ldots, c_k$, and that $B$ is a triangulation of $\mathcal{B}$. Let $\mathcal{F}$ be some
face of $\mathcal{B}$ not containing the corner $c_i$, and let $F$ denote the triangulation of $\mathcal{F}$
induced by $B$. We prove the following technical statement about vertices in $F$.

Claim 19: If $x = (x_1, \ldots, x_k)$ is a vertex of a face $F$ not containing $c_i$, then

1. if $i = 0$, then $x_1 = N$,

2. if $0 < i < k$, then $x_{i+1} = x_i$, and

3. if $i = k$, then $x_k = 0$.

Proof: Each vertex $x$ of $B$ can be expressed using barycentric coordinates with respect
to the corner vertices: that is, $x = \alpha_0 c_0 + \cdots + \alpha_k c_k$, where $0 \le \alpha_j \le 1$
for $0 \le j \le k$ and $\sum_{j=0}^{k} \alpha_j = 1$. Since $x$ is a vertex of a face $F$ not containing
the corner $c_i$, it follows that $\alpha_i = 0$. We consider the three cases.

Case 1: $i = 0$. Each corner $c_1, \ldots, c_k$ has the value $N$ in the first position. Since
$\alpha_0 = 0$, the value in the first position of $\alpha_0 c_0 + \cdots + \alpha_k c_k$ is
$(\alpha_1 + \cdots + \alpha_k)N = N$. Thus, $x_1 = N$.

Case 2: $0 < i < k$. Each corner $c_0, \ldots, c_{i-1}$ has 0 in positions $i$ and $i + 1$, and
each corner $c_{i+1}, \ldots, c_k$ has $N$ in positions $i$ and $i + 1$. Since $\alpha_i = 0$, the linear
combination $\alpha_0 c_0 + \cdots + \alpha_k c_k$ will have the same value
$(\alpha_{i+1} + \cdots + \alpha_k)N$ in positions $i$ and $i + 1$. Thus, $x_i = x_{i+1}$.

Case 3: $i = k$. Each corner $c_0, \ldots, c_{k-1}$ has 0 in position $k$. Since $\alpha_k = 0$, the
value in the $k$th position of $\alpha_0 c_0 + \cdots + \alpha_k c_k$ is 0. Thus, $x_k = 0$. □
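As a sanity check on the case analysis, the sketch below computes the barycentric weights explicitly and verifies Claim 19 on every grid point of a small simplex. It assumes the vertices are the integer points with N >= x_1 >= ... >= x_k >= 0, which is consistent with the corners described above but is an assumption about the triangulation. The closed form alpha_j = (x_j - x_{j+1}) / N, with the conventions x_0 = N and x_{k+1} = 0, is derived here for illustration and is not quoted from the paper.

```python
from fractions import Fraction
from itertools import product

def corner(i, k, N):
    """Corner c_i = (N, ..., N, 0, ..., 0) with i leading entries equal to N."""
    return tuple([N] * i + [0] * (k - i))

def barycentric(x, N):
    """Weights (alpha_0, ..., alpha_k) of a point x with N >= x_1 >= ... >= x_k >= 0."""
    padded = (N,) + tuple(x) + (0,)  # sentinels x_0 = N and x_{k+1} = 0
    return [Fraction(padded[j] - padded[j + 1], N) for j in range(len(x) + 1)]

def claim19_holds(k, N):
    """Check Claim 19 for every grid point of the simplex (k >= 1)."""
    corners = [corner(j, k, N) for j in range(k + 1)]
    for x in product(range(N + 1), repeat=k):
        if any(x[j] < x[j + 1] for j in range(k - 1)):
            continue  # x lies outside the simplex
        alpha = barycentric(x, N)
        if sum(alpha) != 1 or min(alpha) < 0:
            return False
        # The weights must reproduce x as a combination of the corners.
        combo = tuple(sum(alpha[j] * corners[j][p] for j in range(k + 1))
                      for p in range(k))
        if combo != x:
            return False
        for i in range(k + 1):
            if alpha[i] != 0:
                continue  # x is not on a face that omits c_i
            if i == 0:
                ok = x[0] == N          # case 1: x_1 = N
            elif i == k:
                ok = x[k - 1] == 0      # case 3: x_k = 0
            else:
                ok = x[i] == x[i - 1]   # case 2: x_{i+1} = x_i
            if not ok:
                return False
    return True
```

For example, with k = 2 and N = 4 the point (4, 1) has weights (0, 3/4, 1/4), so it lies on the face that omits c_0, and indeed its first coordinate equals N.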

Lemma 20: If $P$ is a protocol for $k$-set agreement tolerating $f$ faults and halting
in $r \le \lfloor f/k \rfloor$ rounds, then $\chi_P$ is a Sperner coloring of $B$.

Proof: We must show that $\chi_P$ satisfies the two conditions of a Sperner coloring.

For the first condition, consider any corner vertex $c_i$. Remember that $c_i$ was originally
labeled with the 1-graph $\mathcal{F}_i$ describing a failure-free execution in which all
processors start with input $v_i$, and that the local communication graph $\mathcal{L}$ labeling $c_i$
is a subgraph of $\mathcal{F}_i$. Since the $k$-set agreement problem requires that any value chosen
by a processor must be an input value of some processor, all processors must choose $v_i$ in
$\mathcal{F}_i$, and it follows that the vertex $c_i$ must be colored with $v_i$. This means that each
corner $c_i$ is colored with a distinct value $v_i$.

For the second condition, consider any face $F$ of $B$, and let us prove that vertices
in $F$ are colored with the colors on the corners of $F$. Equivalently, suppose that $c_i$ is
not a corner of $F$, and let us prove that no vertex in $F$ is colored with $v_i$.

Consider the global communication graph $\mathcal{G}$ originally labeling a vertex $x$ of $F$, and
the graphs $\mathcal{H}_1, \ldots, \mathcal{H}_k$ used in the merge defining $\mathcal{G}$. The definition of this
merge says that the input value labeling a node $\langle p, 0 \rangle$ in $\mathcal{G}$ is $v_m$ where $m$ is the
maximum $m$ such that $\langle p, 0 \rangle$ is labeled with $v_m$ in $\mathcal{H}_m$, or $v_0$ if no such $m$
exists. Again, we consider three cases. In each case, we show that no processor in $\mathcal{G}$ has
the input value $v_i$.

Suppose $i = 0$. Since $x_1 = N$ by Claim 19, we know that $\mathcal{H}_1 = \mathcal{F}_1$, where the
input value of every processor is $v_1$. By the definition of the merge operation, it follows
immediately that no processor in $\mathcal{G}$ can have input value $v_0$.

Suppose $1 \le i < k$. Again, $x_{i+1} = x_i$ by Claim 19. Now, $\mathcal{H}_i$ is the result
of applying $\sigma_i$, the first $x_i$ operations of $\sigma[v_i]$, to the graph $\mathcal{F}_{i-1}$. Similarly,
$\mathcal{H}_{i+1}$ is the result of applying $\sigma_{i+1}$, the first $x_{i+1}$ operations of $\sigma[v_{i+1}]$, to
the graph $\mathcal{F}_i$. Since $x_{i+1} = x_i$, both $\sigma_i$ and $\sigma_{i+1}$ are of the same length, and it
follows that $\sigma_i$ contains an operation of the form $change(p, v_i)$ if and only if $\sigma_{i+1}$
contains an operation of the form $change(p, v_{i+1})$. This implies that for any processor,
either its input value is $v_{i-1}$ in $\mathcal{H}_i$ and $v_i$ in $\mathcal{H}_{i+1}$, or its input value is $v_i$ in
$\mathcal{H}_i$ and $v_{i+1}$ in $\mathcal{H}_{i+1}$. In both cases, $v_i$ is not the input value of this
processor in $\mathcal{G}$.
Suppose $i = k$. Since $x_k = 0$ by Claim 19, we know that $\mathcal{H}_k = \mathcal{F}_{k-1}$, where the
input value of every processor is $v_{k-1}$. By the definition of merge, it follows
immediately that no processor in $\mathcal{G}$ can have input value $v_k$.

Therefore, we have shown that if $x$ is a vertex of a face $F$ of $B$, and $c_i$ is not a
corner vertex of $F$, then the communication graph $\mathcal{G}$ corresponding to $x$ contains no
processor with input value $v_i$. Since every chosen value must be some processor's input
value, the value chosen at this vertex cannot be $v_i$, and it follows that $x$ is assigned a
color other than $v_i$. So, $x$ must be colored by a color $v_j$ such that $c_j$ is a corner vertex
of $F$. Since $c_j$ is colored $v_j$, the second condition of a Sperner coloring holds. So $\chi_P$
is a Sperner coloring. □

Sperner's Lemma guarantees that some primitive simplex is colored by $k + 1$ distinct
values, and this simplex corresponds to a global state in which $k + 1$ processors
choose $k + 1$ distinct values, contradicting the definition of $k$-set agreement:

Theorem 21: If $n \ge f + k + 1$, then no protocol for $k$-set agreement can halt in fewer
than $\lfloor f/k \rfloor + 1$ rounds.

Proof: Suppose $P$ is a protocol for $k$-set agreement tolerating $f$ faults and halting
in $r \le \lfloor f/k \rfloor$ rounds, and consider the corresponding Bermuda Triangle $B$. Lemma 20
says that $\chi_P$ is a Sperner coloring of $B$, so Sperner's Lemma (Lemma 18) says that there is a
simplex $S$ whose vertices are colored with $k + 1$ distinct values $v_0, \ldots, v_k$. Let
$q_0, \ldots, q_k$ and $\mathcal{L}_0, \ldots, \mathcal{L}_k$ be the processors and local communication graphs
labeling the corners of $S$. By Lemma 17, there exists a communication graph $\mathcal{G}$ in which
each $q_i$ is nonfaulty and has local communication graph $\mathcal{L}_i$. This means that $\mathcal{G}$ is a
time $r$ global communication graph of $P$ in which each $q_i$ must choose the value $v_i$. In
other words, $k + 1$ processors must choose $k + 1$ distinct values, contradicting the fact
that $P$ solves $k$-set agreement in $r$ rounds. □

10 Protocol 

An optimal protocol $P$ for $k$-set agreement is given in Figure 13. In this protocol,
processors repeatedly broadcast input values and keep track of the least input value
received in a local variable best. Initially, a processor sets best to its own input value.
In each of the next $\lfloor f/k \rfloor + 1$ rounds, the processor broadcasts the value of best and
then sets best to the smallest value received in that round from any processor (including
itself). In the end, it chooses the value of best as its output value.
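The protocol is short enough to simulate directly. The sketch below is a hypothetical simulator, not part of the paper. It assumes the crash model described in the abstract, so a faulty processor sends to an arbitrary subset of processors in the round in which it crashes and is silent afterwards. The function name `run_protocol`, the `crashes` encoding, and the optional `rounds` override are all illustrative.

```python
def run_protocol(inputs, k, f, crashes, rounds=None):
    """Simulate the min-flooding protocol in a synchronous crash model.

    inputs  -- list of comparable input values, one per processor
    crashes -- dict mapping a faulty processor p to (crash_round, recipients),
               where recipients is the set of processors that still receive
               p's message in its crash round
    rounds  -- number of rounds to run (defaults to floor(f / k) + 1)
    Returns a dict mapping each processor that never crashes to its output.
    """
    n = len(inputs)
    assert len(crashes) <= f, "at most f processors may fail"
    if rounds is None:
        rounds = f // k + 1
    best = list(inputs)
    for rnd in range(1, rounds + 1):
        inbox = [[] for _ in range(n)]
        for p in range(n):
            targets = range(n)
            if p in crashes:
                crash_round, recipients = crashes[p]
                if rnd > crash_round:
                    continue  # p has already crashed and stays silent
                if rnd == crash_round:
                    targets = recipients  # partial broadcast while crashing
            for q in targets:
                inbox[q].append(best[p])
        for q in range(n):
            if inbox[q]:
                best[q] = min(inbox[q])
    return {p: best[p] for p in range(n) if p not in crashes}
```

With k = 1, f = 1, and inputs [0, 1, 2], let processor 0 crash in round 1 after reaching only processor 1, that is, `crashes={0: (1, {1})}`. Stopping after a single round (`rounds=1`) leaves processors 1 and 2 holding 0 and 1, which violates 1-set agreement, while the default of two rounds lets processor 1 relay the value 0 so that both choose 0.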

To prove that $P$ is an optimal protocol, we must prove that, in every execution
of $P$, processors halt in $r = \lfloor f/k \rfloor + 1$ rounds, every processor's output value is some
processor's input value, and the set of output values chosen has size at most $k$. The first
two statements follow immediately from the text of the protocol, so we need only prove
the third. For each time $t$ and processor $p$, let $best_{p,t}$ be the value of best held by $p$ at
time $t$. For each time $t$, let $Best(t)$ be the set of values $best_{q_1,t}, \ldots, best_{q_\ell,t}$ where
the processors $q_1, \ldots, q_\ell$ are the processors active through time $t$. Notice that $Best(0)$ is
the set of input values, and that $Best(r)$ is the set of chosen output values. Our first
observation is that the set $Best(t)$ never increases from one round to the next.

Lemma 22: $Best(t) \supseteq Best(t + 1)$ for all times $t$.
best $\leftarrow$ input_value;

for each round 1 through $\lfloor f/k \rfloor + 1$ do
    broadcast best;
    receive values $b_1, \ldots, b_\ell$ from other processors;
    best $\leftarrow$ min $\{b_1, \ldots, b_\ell\}$;

choose best.

Figure 13: An optimal protocol $P$ for $k$-set agreement.



Proof: If $b \in Best(t + 1)$, then $b = best_{p,t+1}$ for some processor $p$ active through
round $t + 1$. Since $best_{p,t+1}$ is the minimum of the values $b_1, \ldots, b_\ell$ sent to $p$ by
processors during round $t + 1$, we know that $b = best_{q,t}$ for some processor $q$ that is
active through round $t$. Consequently, $b \in Best(t)$. □

We can use this observation to prove that the only executions in which many output 
values are chosen are executions in which many processors fail. We say that a proces- 
sor p fails before time t if there is a processor q to which p sends no message in round t 
(and p may fail to send to q in earlier rounds as well). 

Lemma 23: If $|Best(t)| = d + 1$, then at least $dt$ processors fail before time $t$.

Proof: We proceed by induction on $t$. The case of $t = 0$ is immediate, so suppose
that $t > 0$ and that the induction hypothesis holds for $t - 1$. Since $|Best(t)| = d + 1$
and since $Best(t) \subseteq Best(t - 1)$ by Lemma 22, it follows that $|Best(t - 1)| \ge d + 1$,
and the induction hypothesis for $t - 1$ implies that there is a set $S$ of $d(t - 1)$ processors
that fail before time $t - 1$. It is enough to show that there are an additional $d$ processors
not contained in $S$ that fail before time $t$.

Let $b_0, \ldots, b_d$ be the values of $Best(t)$ written in increasing order. Let $q$ be a
processor with $best_{q,t}$ set to the largest value $b_d$ at time $t$, and for each value $b_i$, let
$q_i$ be a processor that sent $b_i$ in round $t$. The processors $q_0, \ldots, q_d$ are distinct since the
values $b_0, \ldots, b_d$ are distinct, and these processors do not fail before time $t - 1$ since
they send a message in round $t$, so they are not contained in $S$. On the other hand, the
processors $q_0, \ldots, q_{d-1}$ sending the small values $b_0, \ldots, b_{d-1}$ in round $t$ clearly did
not send their values to the processor $q$ setting $best_{q,t}$ to the large value $b_d$, or $q$ would
have set $best_{q,t}$ to a smaller value. Consequently, these $d$ processors $q_0, \ldots, q_{d-1}$ fail
in round $t$ and hence fail before time $t$. □

Since $Best(r)$ is the set of output values chosen by processors at the end of round
$r = \lfloor f/k \rfloor + 1$, if $k + 1$ or more output values are chosen, say $d + 1$ values with
$d \ge k$, then Lemma 23 says that at least $dr \ge kr$ processors fail, which is impossible
since $f < kr$. Consequently, the set of output values chosen has size at most $k$, and we
are done.
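The inequality between the number of faults and k times the number of rounds, used in the last step, follows directly from the definition of the floor function. Spelled out as a short derivation:

```latex
\[
  \left\lfloor \frac{f}{k} \right\rfloor > \frac{f}{k} - 1
  \quad\Longrightarrow\quad
  kr \;=\; k\left(\left\lfloor \frac{f}{k} \right\rfloor + 1\right)
  \;>\; k \cdot \frac{f}{k} \;=\; f .
\]
```

For example, with $f = 5$ and $k = 2$ we get $r = 3$ and $kr = 6 > 5$.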

Theorem 24: The protocol $P$ solves $k$-set agreement in $\lfloor f/k \rfloor + 1$ rounds.